GNU bug report logs - #6252
Emacs does not implement URL (aka "percent") decoding correctly.

Package: emacs;

Reported by: José A. Romero L. <escherdragon <at> gmail.com>

Date: Sun, 23 May 2010 00:52:02 UTC

Severity: normal

Tags: fixed

Fixed in version 24.2

Done: Lars Magne Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


From: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
To: José A. Romero L. <escherdragon <at> gmail.com>
Cc: 6252 <at> debbugs.gnu.org
Subject: bug#6252: Emacs does not implement URL (aka "percent") decoding correctly.
Date: Mon, 24 May 2010 12:33:46 +0900
>>>>> On Sun, 23 May 2010 01:46:54 +0200, José A. Romero L. <escherdragon <at> gmail.com> said:

> Seems that RFC 3986 has not been implemented correctly in
> Emacs. IMHO that is an important hole you have found there. The
> standard requires that all unreserved characters be encoded/decoded
> as UTF8 bytes.

If you are referring to the following part of RFC 3986, note that it
covers only "a new URI scheme". It doesn't say anything about existing
URI schemes, about components that do NOT represent textual data, or,
even for textual data, about components that do NOT consist of
characters from the Universal Character Set.

  When a new URI scheme defines a component that represents textual
  data consisting of characters from the Universal Character Set
  [UCS], the data should first be encoded as octets according to the
  UTF-8 character encoding [STD63]; then only those octets that do not
  correspond to characters in the unreserved set should be percent-
  encoded.

(See also http://lists.gnu.org/archive/html/emacs-devel/2006-08/msg00065.html)
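
To make the quoted rule concrete, here is a minimal sketch in Python
(not Emacs Lisp) of the two steps it describes: encode the text as
UTF-8 octets, then percent-encode only the octets outside the
unreserved set. The name percent_encode_utf8 is hypothetical and chosen
for illustration; the sketch assumes the component really is UCS text,
which is exactly the case the RFC passage is limited to.

```python
# Unreserved characters as listed in RFC 3986, section 2.3:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def percent_encode_utf8(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then escape non-unreserved octets."""
    out = []
    # Step 1: turn the characters into UTF-8 octets.
    for octet in text.encode("utf-8"):
        if octet in UNRESERVED:
            # Unreserved octets are emitted as the character they stand for.
            out.append(chr(octet))
        else:
            # Step 2: every other octet becomes %XX (uppercase hex).
            out.append("%{:02X}".format(octet))
    return "".join(out)


print(percent_encode_utf8("José"))  # prints Jos%C3%A9
```

Note that the escaping works on octets, not characters: the single
character "é" becomes the two escapes %C3%A9.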

Though returning a multibyte string decoded as UTF-8 would be useful
in many cases, I think some "unhex"ing function should also be able to
return a unibyte string.
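
The distinction can be sketched in Python (again, not Emacs Lisp),
where bytes plays the role of a unibyte string and str that of a
decoded multibyte string. Both function names are hypothetical; the
sketch assumes the input is an ASCII-only URI string and is not a
description of how Emacs implements or should implement this.

```python
import re

# Matches one percent-escape: "%" followed by two hex digits.
_PCT = re.compile(rb"%([0-9A-Fa-f]{2})")


def unhex_to_bytes(s: str) -> bytes:
    """Hypothetical helper: replace each %XX with its octet, return raw octets."""
    raw = s.encode("ascii")  # assumption: the URI itself is pure ASCII
    return _PCT.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def unhex_to_text(s: str, encoding: str = "utf-8") -> str:
    """Hypothetical helper: unhex, then decode the octets in a given encoding."""
    return unhex_to_bytes(s).decode(encoding)


# UTF-8 escapes decode cleanly to text.
print(unhex_to_text("Jos%C3%A9"))

# A Latin-1 escape is still meaningful as raw octets ...
print(unhex_to_bytes("Jos%E9"))
# ... and as text once the caller names the right encoding.
print(unhex_to_text("Jos%E9", "latin-1"))
```

With only the UTF-8 variant, the "Jos%E9" input could not be decoded at
all (the decode step raises UnicodeDecodeError), which is why the
octet-returning form is worth keeping alongside it. Python's standard
library makes the same split with urllib.parse.unquote_to_bytes and
urllib.parse.unquote.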

				     YAMAMOTO Mitsuharu
				mituharu <at> math.s.chiba-u.ac.jp




This bug report was last modified 13 years and 104 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.