#23750 - 25.0.95; bug in url-retrieve or json.el

GNU bug report logs - #23750
25.0.95; bug in url-retrieve or json.el

Package: emacs;

Reported by: Leo Liu <sdl.web <at> gmail.com>

Date: Sun, 12 Jun 2016 02:24:02 UTC

Severity: normal

Found in version 25.0.95

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

Message #77 received at 23750 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
Subject: Re: bug#23750: 25.0.95; bug in url-retrieve or json.el
Date: Mon, 20 Jun 2016 17:54:23 +0300

On 06/20/2016 05:38 PM, Eli Zaretskii wrote:

> This all sounds like my response is not welcome, but in that case why
> did you ask the question?

I was kind of hoping for "yes, let's get it into 25.1!"? :)

> No, the bug is where the invalid input is generated in the first
> place. Each API has its contract; if you violate the contract, you
> invoke undefined behavior.

It's a bug in the API, or bad API, if you will. It needs stricter contract, and the submitted patch added it.

Or to look at it another way, the current contract allows url-http-data to be multibyte, because the requirement to the contrary is not documented anywhere that I can see. The variable is simply undocumented.

>> If this is what you need, why not simply test the payload for being a
>> unibyte string? There a function, multibyte-string-p, for that.
>>
>> There are a lot of variables to test (see the comment above the
>> mapconcat call).
>
> Looks like mapc will be able to deal with that. Or just use concat,
> and test the result with multibyte-string-p before sending. Or encode
> it with UTF-8, if it is not unibyte already.

I don't know if we want to be that permissive that we'll encode to UTF-8 silently.

> Btw, I don't think the comment which explains why we started using
> mapconcat is accurate these days. It was written before the move to
> Unicode in Emacs 23, but we stopped converting raw bytes into Latin-1
> characters in Emacs 23 and later. So maybe we should just go back to
> using concat (with erroring out, if the result is multibyte, and/or
> maybe with replacing 'length' with 'string-bytes').

Better error out: the payload's encoding is something only the caller should be concerned with. Unless we're fine with the users assuming that Emacs's internal encoding is close enough to UTF-8.

> Bottom line: like I said, there should be no reason to use
> string-*-unibyte in modern Emacs code on the url-http level or higher
> (maybe not at all). Its use is a sign of some basic misunderstanding,
> or a bug elsewhere, or remnant of old problems that no longer exist.
> So I think we should reconsider the solution on master as well.

I don't mind. Would you advocate for having this fix on emacs-25 if I implement it the way you described?

>> And you'll have to come up with the error message(s).
>
> Are you saying you like the error message from string-to-unibyte?
>
> Cannot convert 123th character to unibyte

It's an order of magnitude better than what was before (no error and silent corruption), but yes, there is space for improvement.
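
For readers following the thread, the approach discussed above (plain concat, then a multibyte-string-p check that errors out rather than encoding silently) could look roughly like the sketch below. This is only an illustration, not the patch that was submitted or installed: the function name my-http-build-request and the wording of the error message are made up for this example, and the real code in url-http.el deals with many more variables.

```elisp
;; Hypothetical helper, NOT part of url-http.el: the name and the error
;; text are assumptions made for illustration only.
(defun my-http-build-request (&rest parts)
  "Concatenate PARTS into one request string and return it.
Signal an error if the result is multibyte, i.e. if some caller
passed text that was not encoded to bytes beforehand."
  (let ((request (apply #'concat parts)))
    ;; `concat' returns a multibyte string as soon as any argument is
    ;; multibyte, so one check on the result covers every part.
    (when (multibyte-string-p request)
      (error "Multibyte text in HTTP request; encode the payload first"))
    request))

;; The caller chooses the encoding, so the payload arrives as bytes.
(let ((body (encode-coding-string "caf\u00e9" 'utf-8)))
  (my-http-build-request "POST / HTTP/1.1\r\n"
                         ;; `string-bytes' counts bytes, which is what a
                         ;; Content-Length header needs.
                         (format "Content-Length: %d\r\n\r\n"
                                 (string-bytes body))
                         body))
```

Passing the unencoded string "caf\u00e9" as the body would make the helper signal an error instead of sending corrupted data, which puts the encoding decision on the caller, as argued in the message.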

This bug report was last modified 9 years and 47 days ago.

