GNU bug report logs -
#24117
25.1; url-http-create-request: Multibyte text in HTTP request
Reported by: Sho Takemori <stakemorii <at> gmail.com>
Date: Sun, 31 Jul 2016 08:28:02 UTC
Severity: normal
Found in version 25.1
Done: Dmitry Gutov <dgutov <at> yandex.ru>
Bug is archived. No further changes may be made.
On 08/09/2016 05:50 PM, Eli Zaretskii wrote:
>> You can't encode it properly without parsing it first.
>
> You don't say what you meant by "encode properly". It's just a
> string, and there are ways to make a string unibyte without any
> parsing.
Different parts of a URL are supposed to be encoded in different ways.
For instance,
http://банки.рф/фыва/
turns into
http://xn--80abwho.xn--p1ai/%D1%84%D1%8B%D0%B2%D0%B0/
The domain is encoded with IDNA, whereas the path uses percent-encoding.
And they're also often encoded separately (e.g. when you copy-paste the
above URL from Firefox to a text editor, the result is
http://банки.рф/%D1%84%D1%8B%D0%B2%D0%B0/).
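To illustrate the split described above, here is a minimal Python sketch (not Emacs code, and not how url-http-create-request works). The function name encode_url is hypothetical, and the sketch drops any port or userinfo for brevity. It applies IDNA to the host and percent-encoding to the path, leaving existing %XX escapes alone so that a half-encoded URL like the Firefox one is not encoded twice.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url: str) -> str:
    """Encode host and path separately (hypothetical helper, illustration only)."""
    parts = urlsplit(url)
    # Host: IDNA (Punycode) via Python's built-in "idna" codec.
    # Simplification: port and userinfo, if any, are dropped.
    host = parts.hostname.encode("idna").decode("ascii")
    # Path: percent-encode the UTF-8 bytes; keep "/" and existing "%" escapes.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(encode_url("http://банки.рф/фыва/"))
    print(encode_url("http://банки.рф/%D1%84%D1%8B%D0%B2%D0%B0/"))
```

Both inputs should come out as the fully encoded form quoted earlier in this message.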
So I think the encoding of the URL parts should be performed inside
url-http-create-request. On the master branch, host is passed through
IDNA encoding, but real-fname is untouched. On emacs-25, I think we
should convert both to unibyte.
I'm not sure encode-coding-string is the way to go (why would we assume
UTF-8?). Personally, I think string-as-unibyte makes more sense (neither
string should contain any multibyte characters at that point), but I
defer to more qualified colleagues.
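The parenthetical point can be shown with a small Python analogy (again, not Emacs code; to_wire_bytes is a hypothetical name). Once the host and path are already encoded, the text is pure ASCII, so the conversion to bytes does not depend on any assumed charset. The sketch refuses non-ASCII input rather than guessing an encoding.

```python
def to_wire_bytes(text: str) -> bytes:
    """Convert already-encoded request text to bytes (illustration only)."""
    if not text.isascii():
        # Non-ASCII here means an earlier encoding step was skipped.
        raise ValueError("expected ASCII-only text, got: %r" % text)
    return text.encode("ascii")


if __name__ == "__main__":
    host = "xn--80abwho.xn--p1ai"
    # For ASCII-only text, ASCII and UTF-8 produce identical bytes.
    print(to_wire_bytes(host) == host.encode("utf-8"))
```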
(Why doesn't (encode-coding-string "aaaa" 'ascii) work?)
This bug report was last modified 8 years and 12 days ago.