GNU bug report logs - #78984
31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values


Package: emacs;

Reported by: Steven Allen <steven <at> stebalien.com>

Date: Wed, 9 Jul 2025 21:30:04 UTC

Severity: normal

Found in version 31.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #20 received at 78984 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Steven Allen <steven <at> stebalien.com>
Cc: 78984 <at> debbugs.gnu.org
Subject: Re: bug#78984: 31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values
Date: Sat, 26 Jul 2025 12:05:32 +0300

> From: Steven Allen <steven <at> stebalien.com>
> Cc: 78984 <at> debbugs.gnu.org
> Date: Thu, 17 Jul 2025 13:01:45 -0700
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> > I guess I'm confused.
> >
> > Thanks.
> 
> I apologize; I shouldn't have called that the root cause.
> `url-path-allowed-chars' allowing % isn't incorrect; it's just the
> ultimate reason `url-build-query-string' passes literal % characters
> through without encoding them. In fact, `url-path-allowed-chars'
> cannot be changed without breaking `url-path-and-query' and
> `url-encode-url', as I explain below.
> 
> First, `url-path-and-query' must not %-decode the path and the query as
> that would make it impossible to parse them correctly. E.g.,
> "/some%2fpath" must not be decoded as "/some/path" before splitting the
> path into components and the same goes for query strings with respect to
> & and =. Specifically, I'm referring to the following in `url-encode-url':
> 
>     (let* ...
>          (path-and-query (url-path-and-query obj))
> 
> Therefore, `url-encode-url' must not %-encode literal % characters when
> encoding the path and query because they were not (nor could they have
> been) %-decoded by `url-path-and-query'. Specifically, I'm referring to
> the following in `url-encode-url':
> 
>     (if path
>         (setq path (url-hexify-string path url-path-allowed-chars)))
>     (if query
>         (setq query (url-hexify-string query url-query-allowed-chars)))
> 
> That's why `url-path-allowed-chars' and `url-query-allowed-chars' must
> allow literal % characters without re-encoding them.
> 
> However, `url-build-query-string' and `url-parse-query-string' are
> different. Unlike `url-path-and-query', `url-parse-query-string' decodes
> the query into a structured object (a key/values alist) and can (and
> does) completely %-decode the keys & values. That means the inverse,
> `url-build-query-string', should assume that its inputs are not
> already %-encoded and should %-encode literal % characters.
> 
> I've attached a patch to show you what I mean. In this patch, I
> introduced a new `url--query-key-value-preserved-chars' constant,
> separate from `url-query-key-value-allowed-chars', to keep
> `url-query-key-value-allowed-chars' consistent with the other
> `-allowed-chars' constants. That is, while % is ALLOWED to appear in
> encoded query keys and values, it shouldn't be PRESERVED when %-escaping
> them.

This patch is fine.  If it is a final version and passes all the
tests, I will install it.  If you have further updates, please post,
and I will install then.

Thanks.
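
For anyone reading this log later, the distinction drawn in the quoted message between % being allowed and % being preserved can be sketched with `url-hexify-string' alone.  This is only an illustration and not the attached patch: the variable names are made up, and the default allowed-character set of `url-hexify-string' (unreserved characters only) stands in for the patch's new constant, whose definition is not shown in this message.

```elisp
(require 'url-util)

;; Illustration only.  `value' is a hypothetical, already-decoded query
;; value that contains a literal % sign.
(let* ((value "100%25")
       ;; `url-path-allowed-chars' allows %, so the literal % is left
       ;; alone and the result looks like an existing escape sequence.
       (kept (url-hexify-string value url-path-allowed-chars))
       ;; The default allowed set (unreserved characters only) does not
       ;; include %, so the literal % is escaped as well.
       (escaped (url-hexify-string value)))
  ;; Compare each decoded result against the original value.
  (list (string= value (url-unhex-string kept))
        (string= value (url-unhex-string escaped))))
```

Assuming `url-path-allowed-chars' allows % as described in the quoted message, the first comparison should fail and the second should succeed, which is the round-trip property the message argues `url-build-query-string' and `url-parse-query-string' ought to have.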




This bug report was last modified 12 days ago.
