GNU bug report logs - #78984
31.0.50; `url-build-query-string' fails to escape a literal `%' in keys and values

Previous Next

Package: emacs;

Reported by: Steven Allen <steven <at> stebalien.com>

Date: Wed, 9 Jul 2025 21:30:04 UTC

Severity: normal

Found in version 31.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

Full log


Message #11 received at 78984 <at> debbugs.gnu.org (full text, mbox):

From: Steven Allen <steven <at> stebalien.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 78984 <at> debbugs.gnu.org
Subject: Re: bug#78984: 31.0.50; `url-build-query-string' fails to escape a
 literal `%' in keys and values
Date: Sat, 12 Jul 2025 08:51:38 -0700
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Wed, 09 Jul 2025 14:29:17 -0700
>> From:  Steven Allen via "Bug reports for GNU Emacs,
>>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>> 
>> 
>> `url-build-query-string' fails to escape literal `%' characters and keys
>> and values. To reproduce, run the following and note that it's encoded
>> as "%%3D" when it should be encoded as "%25%3D".
>> 
>>     emacs --batch --eval '(message "%s" (url-build-query-string (list (list "key" "%="))))'
>> 
>> The root cause appears to be that `%' is explicitly allowed in
>> `url-host-allowed-chars' (inherited by
>> `url-query-key-value-allowed-chars'), apparently to avoid re-encoding
>> %-encoded sequences. Unfortunately, changing this will definitely break
>>   things (e.g., `url-encode-url' will no longer round-trip).
>> 
>> The direct/simple fix would be to forbid `%' in
>> `url-query-key-value-allowed-chars'.
>> 
>> This issue also appears to affect Eglot's `eglot-path-to-uri' function
>> (`eglot--uri-path-allowed-chars' allows `%' in path-names),
>> but I haven't been able to find a stand-alone reproducer.
>
> How about documenting that url-build-query-string should already have
> any literal % characters encoded as %25, and changing all the callers
> to abide by that requirement?

That sounds like a reasonable solution for `url-host-allowed-chars',
`url-path-allowed-chars', `url-query-allowed-chars', and 
`url-query-key-value-allowed-chars':

1. Allowing `%' is "correct" as `%' is, in fact, allowed in URLs.

2. These variables are used in conjunction with `url-hexify-string' to
escape strings that may already be partially %-encoded.

3. From what I can tell, the documentation in these cases is technically
correct already, if a bit misleading.

However, IMO, `url-build-query-string' should escape everything, 
including literal `%' characters:

1. `url-parse-query-string', the inverse of `url-build-query-string',
decodes %-encoded strings. In order for these two functions to
round-trip, `url-build-query-string' needs to escape `%'. Specifically,
the following test should pass:

    (ert-deftest url-query-string-round-trips ()
      (let* ((query-string "foo=%25%3D")
             (parsed-query (url-parse-query-string query-string))
             (round-trip (url-build-query-string parsed-query)))
        (should (equal round-trip query-string))))

2. `url-build-query-string' is a higher-level tool to transform structured
data into a query string. IMO, the user shouldn't ahve to think about
pre-escaping their strings here.

I'm happy to provide patches (both to improve the docs and fix this
issue) if you think this is a reasonable approach.




This bug report was last modified 12 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.