GNU bug report logs - #78984
31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values
Reported by: Steven Allen <steven <at> stebalien.com>
Date: Wed, 9 Jul 2025 21:30:04 UTC
Severity: normal
Found in version 31.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #20 received at 78984 <at> debbugs.gnu.org (full text, mbox):
> From: Steven Allen <steven <at> stebalien.com>
> Cc: 78984 <at> debbugs.gnu.org
> Date: Thu, 17 Jul 2025 13:01:45 -0700
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
> > I guess I'm confused.
> >
> > Thanks.
>
> I apologize; I shouldn't have called that the root cause.
> `url-path-allowed-chars' allowing % isn't incorrect; it's just the
> ultimate reason `url-build-query-string' passes literal % characters
> through without encoding them. In fact, `url-path-allowed-chars'
> cannot be changed without breaking `url-path-and-query' and
> `url-encode-url', as I explain below:
>
> First, `url-path-and-query' must not %-decode the path and the query as
> that would make it impossible to parse them correctly. E.g.,
> "/some%2fpath" must not be decoded as "/some/path" before splitting the
> path into components and the same goes for query strings with respect to
> & and =. Specifically, I'm referring to the following in `url-encode-url':
>
>     (let* ...
>            (path-and-query (url-path-and-query obj))
>
> Therefore, `url-encode-url' must not %-encode literal % characters when
> encoding the path and query because they were not (nor could they have
> been) %-decoded by `url-path-and-query'. Specifically, I'm referring to
> the following in `url-encode-url':
>
>     (if path
>         (setq path (url-hexify-string path url-path-allowed-chars)))
>     (if query
>         (setq query (url-hexify-string query url-query-allowed-chars)))
>
> That's why `url-path-allowed-chars' and `url-query-allowed-chars' must
> allow literal % characters without re-encoding them.
>
> However, `url-build-query-string' and `url-parse-query-string' are
> different. Unlike `url-path-and-query', `url-parse-query-string' decodes
> the query into a structured object (a key/values alist) and can (and
> does) completely %-decode the keys and values. That means its inverse,
> `url-build-query-string', should assume that its inputs are not
> already %-encoded and should therefore %-encode literal % characters.
>
> I've attached a patch to show you what I mean. In this patch, I
> introduced a new `url--query-key-value-preserved-chars' constant,
> separate from `url-query-key-value-allowed-chars', to keep
> `url-query-key-value-allowed-chars' consistent with the other
> `-allowed-chars' constants. That is, while % is ALLOWED to appear in
> encoded query keys and values, it shouldn't be PRESERVED when %-escaping
> them.
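
The distinction drawn in the quoted message can be illustrated with a
minimal Emacs Lisp sketch. It assumes an Emacs with the bundled url
library (url-util.el); the sample key "discount" and value "100%25off"
are made up for illustration, and the results of the last two forms are
deliberately not stated because they depend on whether the patch under
discussion is installed.

```elisp
;; Sketch only; assumes the url library shipped with Emacs.
(require 'url-util)

;; With the default allowed set (the unreserved characters), a literal
;; % is escaped as %25.
(url-hexify-string "100%")          ; "100%25"

;; With `url-path-allowed-chars', % is treated as allowed, so an
;; already-encoded path is expected to pass through unchanged.  This is
;; the behavior `url-encode-url' relies on, per the message above.
(url-hexify-string "/some%2fpath" url-path-allowed-chars)

;; Round trip through the structured query API.  The value below is the
;; literal nine-character string "100%25off" (hypothetical sample data).
;; If `url-build-query-string' does not escape the %, the parse step
;; decodes "%25" and cannot recover the original value.
(let* ((query '(("discount" "100%25off")))
       (encoded (url-build-query-string query)))
  (url-parse-query-string encoded))
```

Comparing the value returned by the last form with the original alist
is one way to check whether building and parsing are true inverses.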
This patch is fine. If it is a final version and passes all the
tests, I will install it. If you have further updates, please post,
and I will install then.
Thanks.
This bug report was last modified 12 days ago.