GNU bug report logs - #78984
31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values

Package: emacs;

Reported by: Steven Allen <steven <at> stebalien.com>

Date: Wed, 9 Jul 2025 21:30:04 UTC

Severity: normal

Found in version 31.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

From: Eli Zaretskii <eliz <at> gnu.org>
To: Steven Allen <steven <at> stebalien.com>
Cc: 78984 <at> debbugs.gnu.org
Subject: bug#78984: 31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values
Date: Thu, 17 Jul 2025 09:34:39 +0300
> From: Steven Allen <steven <at> stebalien.com>
> Cc: 78984 <at> debbugs.gnu.org
> Date: Sat, 12 Jul 2025 08:51:38 -0700
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Date: Wed, 09 Jul 2025 14:29:17 -0700
> >> From:  Steven Allen via "Bug reports for GNU Emacs,
> >>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> >> 
> >> 
> >> `url-build-query-string' fails to escape literal `%' characters in keys
> >> and values. To reproduce, run the following and note that the value is
> >> encoded as "%%3D" when it should be encoded as "%25%3D".
> >> 
> >>     emacs --batch --eval '(message "%s" (url-build-query-string (list (list "key" "%="))))'
> >> 
> >> The root cause appears to be that `%' is explicitly allowed in
> >> `url-host-allowed-chars' (inherited by
> >> `url-query-key-value-allowed-chars'), apparently to avoid re-encoding
> >> %-encoded sequences. Unfortunately, changing this will definitely break
> >> things (e.g., `url-encode-url' will no longer round-trip).
> >> 
> >> The direct/simple fix would be to forbid `%' in
> >> `url-query-key-value-allowed-chars'.
> >> 
> >> This issue also appears to affect Eglot's `eglot-path-to-uri' function
> >> (`eglot--uri-path-allowed-chars' allows `%' in path-names),
> >> but I haven't been able to find a stand-alone reproducer.
> >
> > How about documenting that url-build-query-string should already have
> > any literal % characters encoded as %25, and changing all the callers
> > to abide by that requirement?
> 
> That sounds like a reasonable solution for `url-host-allowed-chars',
> `url-path-allowed-chars', `url-query-allowed-chars', and 
> `url-query-key-value-allowed-chars':
> 
> 1. Allowing `%' is "correct" as `%' is, in fact, allowed in URLs.
> 
> 2. These variables are used in conjunction with `url-hexify-string' to
> escape strings that may already be partially %-encoded.
> 
> 3. From what I can tell, the documentation in these cases is technically
> correct already, if a bit misleading.
> 
> However, IMO, `url-build-query-string' should escape everything, 
> including literal `%' characters:
> 
> 1. `url-parse-query-string', the inverse of `url-build-query-string',
> decodes %-encoded strings. In order for these two functions to
> round-trip, `url-build-query-string' needs to escape `%'. Specifically,
> the following test should pass:
> 
>     (ert-deftest url-query-string-round-trips ()
>       (let* ((query-string "foo=%25%3D")
>              (parsed-query (url-parse-query-string query-string))
>              (round-trip (url-build-query-string parsed-query)))
>         (should (equal round-trip query-string))))
> 
> 2. `url-build-query-string' is a higher-level tool to transform structured
> data into a query string. IMO, the user shouldn't have to think about
> pre-escaping their strings here.
> 
> I'm happy to provide patches (both to improve the docs and fix this
> issue) if you think this is a reasonable approach.

The situation you described sounds like a contradiction to me: some
use cases need url-build-query-string to have literal % encoded as
%25, and others need the literal % NOT encoded.  In particular, you
originally said that the root cause of the problem was that
url-path-allowed-chars allows '%', but now you say that
url-build-query-string should encode literal '%', although
url-path-allowed-chars is okay as it is?  So I don't understand what
kind of a solution you have in mind; perhaps I'm missing something?
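
To make the two behaviours under discussion concrete, here is a minimal
sketch, not a proposed patch.  It assumes the url-util.el described in
the report, where `%' is in the allowed set of
`url-query-key-value-allowed-chars'.  The names
`my-strict-key-value-allowed-chars' and `my-hexify-strict' are
hypothetical and exist only for this illustration.

```elisp
(require 'url-util)

;; Hypothetical table (not part of url-util.el): a copy of the stock
;; key/value table with the `%' slot cleared, so that a literal `%'
;; is escaped like any other disallowed byte.
(defconst my-strict-key-value-allowed-chars
  (let ((vec (copy-sequence url-query-key-value-allowed-chars)))
    (aset vec ?% nil)
    vec))

;; Hypothetical helper: hexify one key or value with the strict table.
(defun my-hexify-strict (string)
  (url-hexify-string string my-strict-key-value-allowed-chars))

;; Use case 1: the caller passes text that is already %-encoded.
;; With `%' allowed, the existing escapes are left alone.
(url-hexify-string "%25%3D" url-query-key-value-allowed-chars)

;; Use case 2: the caller passes raw data containing a literal `%'.
;; With `%' forbidden, it is escaped as well.
(my-hexify-strict "%=")   ; => "%25%3D"
```

The same input cannot be treated correctly by both tables: the strict
one would re-encode the `%' of an existing escape, and the stock one
passes a raw `%' through untouched.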

I guess I'm confused.

Thanks.



