GNU bug report logs - #24117
25.1; url-http-create-request: Multibyte text in HTTP request


Package: emacs;

Reported by: Sho Takemori <stakemorii <at> gmail.com>

Date: Sun, 31 Jul 2016 08:28:02 UTC

Severity: normal

Found in version 25.1

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.


From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: stakemorii <at> gmail.com, larsi <at> gnus.org, 24117 <at> debbugs.gnu.org
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Mon, 8 Aug 2016 04:56:58 +0300

Hi Eli,

On 08/04/2016 08:02 PM, Eli Zaretskii wrote:

> Hmm, but url-generic-parse-url is called from gazillion other places,
> so maybe this is not safe.

Only about 40 places, all of them either in lisp/url or lisp/gnus. 
Sadly, Lars is being silent on the matter.

It might not be 100% safe, but maybe doing TRT could be enough.

> No, I meant that since RFC 3986 doesn't allow non-ASCII characters,

Indeed.

> and url-generic-parse-url doesn't do anything about that, it is either
> already broken for non-ASCII characters, or already copes with them.
> So we don't need to worry about that.

I imagined that some code that uses the return value of 
url-http-create-request might perform the escaping. But that doesn't 
seem to be the case, see below.

> However, a safer change would be to do something like this:
>
>    (or (not (multibyte-string-p url-http-target-url))
>        (setq url-http-target-url
>              (decode-coding-string url-http-target-url 'utf-8)))
>
> in url-http-create-request.  Can you try this?

I'll try it if you insist, but that choice of encoding seems rather 
arbitrary. I think we should go with your previous suggestion: make the 
URL parsing buffer unibyte.
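
For reference, the unibyte/multibyte distinction this exchange turns on can be seen in isolation. The following is only a minimal sketch using the host name from later in this message; it does not reproduce what url-http-create-request does, and it assumes the file is read as UTF-8 so that the literal is a multibyte string.

```elisp
;; A string literal containing non-ASCII characters is multibyte
;; (a sequence of characters).
(multibyte-string-p "банки.рф")

;; Encoding it produces a unibyte string (a sequence of raw UTF-8 bytes).
(multibyte-string-p (encode-coding-string "банки.рф" 'utf-8))

;; Decoding those bytes as UTF-8 gives back the original characters.
(equal (decode-coding-string
        (encode-coding-string "банки.рф" 'utf-8)
        'utf-8)
       "банки.рф")
```

The first and third forms should evaluate to t and the second to nil.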

But we do try to handle non-ASCII URLs on the level above 
url-generic-parse-url. See url-retrieve-internal: one of the first 
things it does is (setq url (url-encode-url url)). And only after that, 
(setq url (url-generic-parse-url url)).
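
That order of operations can be sketched as follows. The helper name and the example.com URL are made up for illustration, and this is not the code of url-retrieve-internal itself; it only shows encoding first and parsing second, with a non-ASCII character in the path rather than in the host.

```elisp
(require 'url-util)
(require 'url-parse)

;; Hypothetical helper, for illustration only.
(defun my-encode-then-parse (url)
  "Percent-encode URL, then parse it; return (ENCODED HOST FILENAME)."
  (let* (;; Step 1: non-ASCII characters are percent-encoded as UTF-8.
         (encoded (url-encode-url url))
         ;; Step 2: the ASCII-only result is parsed into a URL object.
         (parsed (url-generic-parse-url encoded)))
    (list encoded (url-host parsed) (url-filename parsed))))

(my-encode-then-parse "http://example.com/päth")
```

With the path character "ä" I would expect the encoded form to contain "%C3%A4", so that the parser only ever sees ASCII.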

The URL package doesn't seem to support international domains anyway. 
This fails:

(url-retrieve-synchronously "http://банки.рф")

However, the error it fails with is a bit more comprehensible if the URL 
parsing buffer is unibyte:

Debugger entered--Lisp error: (error "банки.рф/80 Name or service not known")

Instead of:

Debugger entered--Lisp error: (error
"\301\220\300\261\301\220\300\260\301\220\300\275\301\220\300\272\301\220\300\270.\301\221\300\200\301\221\300\204/80 Name or service not known")
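
The octal escapes in the second message appear to be the UTF-8 bytes of the host name, each stored in the two-byte form (lead byte \300 or \301) that Emacs uses internally for raw 8-bit bytes in multibyte text. That reading is my own assumption rather than something established in this thread. Under that assumption, a small sketch with a hypothetical helper recovers the readable name:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-undo-raw-byte-pairs (garbled)
  "Collapse \\300/\\301 byte pairs in unibyte GARBLED, then decode as UTF-8."
  (let ((bytes (append garbled nil))
        (out nil))
    (while bytes
      (let ((lead (pop bytes)))
        (push (if (memq lead '(#xC0 #xC1))
                  ;; Rebuild the original byte from the pair: the low bit
                  ;; of the lead byte supplies bit 6, the trailing byte
                  ;; supplies the low six bits, and bit 7 is always set.
                  (logior #x80
                          (ash (logand lead 1) 6)
                          (logand (pop bytes) #x3F))
                ;; Plain ASCII bytes such as "." pass through unchanged.
                lead)
              out)))
    (decode-coding-string (apply #'unibyte-string (nreverse out)) 'utf-8)))

(my-undo-raw-byte-pairs
 "\301\220\300\261\301\220\300\260\301\220\300\275\301\220\300\272\301\220\300\270.\301\221\300\200\301\221\300\204")
```

If the assumption holds, the call returns the same host name that the unibyte case prints directly.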




This bug report was last modified 8 years and 12 days ago.
