GNU bug report logs - #24117
25.1; url-http-create-request: Multibyte text in HTTP request

Package: emacs;

Reported by: Sho Takemori <stakemorii <at> gmail.com>

Date: Sun, 31 Jul 2016 08:28:02 UTC

Severity: normal

Found in version 25.1

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

Message #101 received at 24117 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: stakemorii <at> gmail.com, 24117 <at> debbugs.gnu.org
Subject: Re: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Tue, 09 Aug 2016 11:39:20 +0200

Dmitry Gutov <dgutov <at> yandex.ru> writes:

> Here's another question: why does url-encode-url pass the argument
> through encode-coding-string before passing it to
> url-generic-parse-url, if the latter is expected to be able to deal
> with non-ASCII characters?

I don't know.  I don't think `url-encode-url' has ever really worked in
any sensible way in the presence of non-ASCII characters.

> The only recent change in that function is your commit 8b61c22e dated
> last December, which very much looks like a band-aid in this context.

It's debatable what that function should return in the presence of
non-ASCII domain names, but it's a debatable function all around.

> Since you're better versed in this area than me, can you propose a
> specific fix for the currently discussed bug? It is more serious than
> not being able to use Unicode in URLs.

I didn't understand the original bug report and there was no simple
recipe to reproduce the bug.  Why changing url-generic-parse-url was
proposed as a solution is even less clear.  Perhaps you could write a
test case and summarise what you think the problem is?

> On master, the domain part, which is untouched by url-encode-url, is
> converted to an ASCII unibyte string with puny-encode-domain, inside
> url-http-create-request. But real-fname remains a multibyte string,
> triggering the problem anyway.

The domain is encoded according to IDNA, which yields an ASCII string,
yes.  (Whether the function returns a unibyte string or not I can't
recall.)
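
As a rough illustration of the distinction discussed above, the sketch
below checks whether a string has one byte per character, first for an
IDNA-encoded host and then for a path before and after percent-encoding.
The `sketch-' helpers, the host name and the path are hypothetical and
not part of url.el; only `puny-encode-domain', `url-hexify-string',
`split-string' and `string-bytes' are existing Emacs functions.  It is
not a proposed fix for this bug, just a way to inspect the strings
involved.

```elisp
;;; Sketch only: the `sketch-' helpers are hypothetical, not part of url.el.
(require 'puny)      ; `puny-encode-domain', the function named in this thread
(require 'url-util)  ; `url-hexify-string'

(defun sketch-encode-path (path)
  "Percent-encode each \"/\"-separated segment of PATH.
`url-hexify-string' encodes multibyte text as UTF-8 before escaping,
so the result contains ASCII characters only."
  (mapconcat #'url-hexify-string (split-string path "/") "/"))

(defun sketch-wire-safe-p (string)
  "Return non-nil if STRING has exactly one byte per character.
A multibyte string holding non-ASCII characters fails this check."
  (= (length string) (string-bytes string)))

(let ((host (puny-encode-domain "bücher.example")) ; made-up host
      (path "/wiki/日本語"))                        ; made-up path
  (list (sketch-wire-safe-p host)                        ; IDNA output is ASCII
        (sketch-wire-safe-p path)                        ; raw path is not
        (sketch-wire-safe-p (sketch-encode-path path)))) ; escaped path is
```

Evaluating the final form returns a list of three booleans, one per
check.  Comparing `length' with `string-bytes' sidesteps the question of
whether a given function returns a unibyte or a multibyte string object:
an ASCII-only string passes either way.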

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 8 years and 12 days ago.


