GNU bug report logs -
#78984
31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values
Previous Next
Reported by: Steven Allen <steven <at> stebalien.com>
Date: Wed, 9 Jul 2025 21:30:04 UTC
Severity: normal
Found in version 31.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
> From: Steven Allen <steven <at> stebalien.com>
> Cc: 78984 <at> debbugs.gnu.org
> Date: Sat, 12 Jul 2025 08:51:38 -0700
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> >> Date: Wed, 09 Jul 2025 14:29:17 -0700
> >> From: Steven Allen via "Bug reports for GNU Emacs,
> >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> >>
> >>
> >> `url-build-query-string' fails to escape literal `%' characters and keys
> >> and values. To reproduce, run the following and note that it's encoded
> >> as "%%3D" when it should be encoded as "%25%3D".
> >>
> >> emacs --batch --eval '(message "%s" (url-build-query-string (list (list "key" "%="))))'
> >>
> >> The root cause appears to be that `%' is explicitly allowed in
> >> `url-host-allowed-chars' (inherited by
> >> `url-query-key-value-allowed-chars'), apparently to avoid re-encoding
> >> %-encoded sequences. Unfortunately, changing this will definitely break
> >> things (e.g., `url-encode-url' will no longer round-trip).
> >>
> >> The direct/simple fix would be to forbid `%' in
> >> `url-query-key-value-allowed-chars'.
> >>
> >> This issue also appears to affect Eglot's `eglot-path-to-uri' function
> >> (`eglot--uri-path-allowed-chars' allows `%' in path-names),
> >> but I haven't been able to find a stand-alone reproducer.
> >
> > How about documenting that url-build-query-string should already have
> > any literal % characters encoded as %25, and changing all the callers
> > to abide by that requirement?
>
> That sounds like a reasonable solution for `url-host-allowed-chars',
> `url-path-allowed-chars', `url-query-allowed-chars', and
> `url-query-key-value-allowed-chars':
>
> 1. Allowing `%' is "correct" as `%' is, in fact, allowed in URLs.
>
> 2. These variables are used in conjunction with `url-hexify-string' to
> escape strings that may already be partially %-encoded.
>
> 3. From what I can tell, the documentation in these cases is technically
> correct already, if a bit misleading.
>
> However, IMO, `url-build-query-string' should escape everything,
> including literal `%' characters:
>
> 1. `url-parse-query-string', the inverse of `url-build-query-string',
> decodes %-encoded strings. In order for these two functions to
> round-trip, `url-build-query-string' needs to escape `%'. Specifically,
> the following test should pass:
>
> (ert-deftest url-query-string-round-trips ()
> (let* ((query-string "foo=%25%3D")
> (parsed-query (url-parse-query-string query-string))
> (round-trip (url-build-query-string parsed-query)))
> (should (equal round-trip query-string))))
>
> 2. `url-build-query-string' is a higher-level tool to transform structured
> data into a query string. IMO, the user shouldn't ahve to think about
> pre-escaping their strings here.
>
> I'm happy to provide patches (both to improve the docs and fix this
> issue) if you think this is a reasonable approach.
The situation you described sounds like a contradiction to me: some
use cases need url-build-query-string to have literal % encoded as
%25, and others need the literal % NOT encoded. In particular, you
originally said that the root cause of the problem was that
url-path-allowed-chars allows '%', but now you say that
url-build-query-string should encode literal '%', although
url-path-allowed-chars is okay as it is? So I don't understand what
kind of a solution you have in mind; perhaps I'm missing something?
I guess I'm confused.
Thanks.
This bug report was last modified 12 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.