#24117 - 25.1; url-http-create-request: Multibyte text in HTTP request

GNU bug report logs - #24117
25.1; url-http-create-request: Multibyte text in HTTP request

Package: emacs;

Reported by: Sho Takemori <stakemorii <at> gmail.com>

Date: Sun, 31 Jul 2016 08:28:02 UTC

Severity: normal

Found in version 25.1

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

Message #110 received at 24117 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: stakemorii <at> gmail.com, 24117 <at> debbugs.gnu.org
Subject: Re: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Wed, 10 Aug 2016 09:50:16 +0300

On 08/09/2016 12:39 PM, Lars Ingebrigtsen wrote:

> I don't know. I don't think `url-encode-url' has ever really worked in
> any sensible way in the presence of non-ASCII.

My point is, you're saying that url-generic-parse-url should accept (and
handle properly) multibyte URLs. But url-encode-url still encodes the
URL string before passing it to url-generic-parse-url.

> It's debatable what that function should return in the presence of
> non-ASCII domain names, but it's a debatable function all around.

The way the version in master works makes quite a bit of sense to me.

> I didn't understand the original bug report and there was no simple
> recipe to reproduce the bug. Why changing url-generic-parse-url was
> proposed as a solution is even less unclear. Perhaps you could write a
> test case and summarise what you think the problem is?

Please try this:

(with-current-buffer
    (let ((url-request-data (encode-coding-string "фыва" 'utf-8)))
      (url-retrieve-synchronously "http://posttestserver.com/post.php"))
  (buffer-string))

You'll get the "Multibyte text in HTTP request" error, which was added
in a98aa02a5dbf079f7b4f3be5487a2f2b741d103d, to validate request data
and make sure that users don't have to spend too much time investigating
problems in their own code like bug#23750.

But the added validation relied on the assumption that the situation
with multibyte/unibyte strings that url-http-create-request acts on is
somewhat sane, which is not true, as the current discussion has
demonstrated.

So we either need to straighten it up, or change the validation logic.
If everything fails, of course, we can revert the aforementioned commit,
but that would be bad for users.

>> On master, the domain part, which is untouched by url-encode-url, is
>> converted to an ASCII unibyte string with puny-encode-domain, inside
>> url-http-create-request. But real-fname remains a multibyte string,
>> triggering the problem anyway.
>
> The domain is encoded according to IDNA, which is an ASCII string, yes.
> (Whether the function returns a unibyte string or not I can't recall.)

It does.
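
[Illustration, not part of the original message.] The following is a minimal sketch of why a single multibyte piece can make the whole request fail validation even when the body was encoded correctly. It assumes the validation compares the byte count of the request with its character count; `my-request-unibyte-p' and the `my-' variables are hypothetical names invented for this example and do not come from url-http.el. The multibyte `my-fname' merely stands in for real-fname as described in the message above.

```elisp
;; Hypothetical helper for illustration; not a function from url-http.el.
(defun my-request-unibyte-p (request)
  "Return non-nil if REQUEST holds exactly one byte per character."
  (= (string-bytes request) (length request)))

;; The request body from the recipe: already encoded, hence unibyte.
(defvar my-body (encode-coding-string "фыва" 'utf-8))

;; An ASCII-only file name that is nevertheless a multibyte string
;; (assumed here to stand in for real-fname).
(defvar my-fname (string-to-multibyte "/post.php"))

;; Concatenating with a multibyte string makes the result multibyte;
;; the raw bytes of the body then occupy two bytes each internally,
;; so the byte count no longer matches the character count.
(defvar my-bad-request
  (concat "POST " my-fname " HTTP/1.1\r\n\r\n" my-body))

;; Encoding the file name first keeps every piece unibyte.
(defvar my-good-request
  (concat "POST " (encode-coding-string my-fname 'utf-8)
          " HTTP/1.1\r\n\r\n" my-body))

(message "bad passes check: %S, good passes check: %S"
         (my-request-unibyte-p my-bad-request)
         (my-request-unibyte-p my-good-request))
```

Under these assumptions, the first request is expected to fail the check and the second to pass it, which matches the observation that a multibyte real-fname triggers the error regardless of how the caller encoded url-request-data.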

This bug report was last modified 8 years and 64 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
