#24117 - 25.1; url-http-create-request: Multibyte text in HTTP request

GNU bug report logs - #24117
25.1; url-http-create-request: Multibyte text in HTTP request

Package: emacs

Reported by: Sho Takemori <stakemorii <at> gmail.com>

Date: Sun, 31 Jul 2016 08:28:02 UTC

Severity: normal

Found in version 25.1

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: stakemorii <at> gmail.com, larsi <at> gnus.org, 24117 <at> debbugs.gnu.org
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Mon, 8 Aug 2016 04:56:58 +0300

Hi Eli,

On 08/04/2016 08:02 PM, Eli Zaretskii wrote:

> Hmm, but url-generic-parse-url is called from gazillion other places,
> so maybe this is not safe.

Only about 40 places, all of them either in lisp/url or lisp/gnus. Sadly, Lars is being silent on the matter.

It might not be 100% safe, but maybe doing TRT could be enough.

> No, I meant that since RFC 3986 doesn't allow non-ASCII characters,

Indeed.

> and url-generic-parse-url doesn't do anything about that, it is either
> already broken for non-ASCII characters, or already copes with them.
> So we don't need to worry about that.

I imagined that some code that uses the return value of url-http-create-request might perform the escaping. But that doesn't seem to be the case, see below.

> However, a safer change would be to do something like this:
>
>   (or (not (multibyte-string-p url-http-target-url))
>       (setq url-http-target-url
>             (decode-coding-string url-http-target-url 'utf-8)))
>
> in url-http-create-request. Can you try this?

I'll try it if you insist, but that choice of encoding seems rather arbitrary. I think we should go with your previous suggestion: make the URL parsing buffer unibyte.

But we do try to handle non-ASCII URLs on the level above url-generic-parse-url. See url-retrieve-internal: one of the first things it does is (setq url (url-encode-url url)). And only after that, (setq url (url-generic-parse-url url)).

The URL package doesn't seem to support international domains anyway. This fails:

  (url-retrieve-synchronously "http://банки.рф")

However, the error it fails with is a bit more comprehensible if the URL parsing buffer is unibyte:

  Debugger entered--Lisp error: (error "банки.рф/80 Name or service not known")

Instead of:

  Debugger entered--Lisp error: (error "\301\220\300\261\301\220\300\260\301\220\300\275\301\220\300\272\301\220\300\270.\301\221\300\200\301\221\300\204/80 Name or service not known")
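
For illustration, the encode-then-parse order that the message attributes to url-retrieve-internal can be tried in isolation. This is only a sketch, not the actual code of url-retrieve-internal: the helper name my-demo-encode-then-parse and the example.com URL are invented for this note, and the non-ASCII text is deliberately placed in the path rather than the host, since the message says international domains are not handled.

```elisp
;; Sketch only: imitates the order of operations described in the message
;; (url-encode-url first, url-generic-parse-url second).
(require 'url-util)   ; provides url-encode-url
(require 'url-parse)  ; provides url-generic-parse-url and the url-* accessors

(defun my-demo-encode-then-parse (raw)
  "Hypothetical helper: percent-encode RAW, then parse the result.
Return a list (ENCODED HOST FILENAME)."
  (let* ((encoded (url-encode-url raw))
         (parsed (url-generic-parse-url encoded)))
    (list encoded (url-host parsed) (url-filename parsed))))

;; Hypothetical input with a non-ASCII path component.
(my-demo-encode-then-parse "http://example.com/банки")
```

Evaluated this way, the string handed to url-generic-parse-url should already consist of ASCII characters only, which is the situation the message describes for callers that go through url-retrieve-internal.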

This bug report was last modified 8 years and 65 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
