On Sat, Feb 01, 2025 at 09:10:04PM +0100, Tomas Volf wrote:
>
> Hello,
>
> I think I found a bug in the htmlprag module in guile-lib. When parsing
> attributes, the values are not properly decoded:
>
> --8<---------------cut here---------------start------------->8---
> scheme@(guile-user)> ,use (htmlprag)
> scheme@(guile-user)> (html->sxml "<hr aaa=\"bbb&quot;ccc&apos;ddd\">")
> $1 = (*TOP* (hr (@ (aaa "bbb&quot;ccc&apos;ddd"))))
> scheme@(guile-user)> (html->sxml "<a href=\"a&amp;b\">")
> $2 = (*TOP* (a (@ (href "a&amp;b"))))
> --8<---------------cut here---------------end--------------->8---
>
> I think that $1 should be "bbb\"ccc'ddd" and $2 should be "a&b".

Ouch. Have you contacted Oleg Kiselyov about it? He's usually pretty
responsive and very friendly.

> The annoying part is that this cannot really be changed now, because
> people (me included) already have workarounds in place, and
> automatically decoding now would lead to double decoding.
>
> I see a few ways forward:
>
> 1. Document the current behavior and keep it as it is.
> 2. Add an argument #:decode-attributes, defaulting to #f, to the relevant
> procedures, so that people can opt into the fixed behavior.
> 3. Introduce a parameter %decode-attributes, so that people can opt into
> the fixed behavior.
>
> I am sure there are also other approaches possible.

If it were me, I'd take option 2.
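
For what it's worth, here is a rough sketch of how the opt-in could look
from the outside. It is not htmlprag's code: decode-entities,
decode-attribute-values and html->sxml* are names I made up, the entity
table is deliberately tiny (no numeric references), and the wrapper takes
the parser as an argument so the sketch does not depend on guile-lib.

```scheme
(use-modules (ice-9 match))

;; Hypothetical, deliberately incomplete entity table.
(define %entities
  '(("amp" . "&") ("lt" . "<") ("gt" . ">") ("quot" . "\"") ("apos" . "'")))

(define (decode-entities str)
  "Replace the named entities listed in %entities; leave anything else as is."
  (let loop ((start 0) (acc '()))
    (let* ((amp  (string-index str #\& start))
           (semi (and amp (string-index str #\; amp)))
           (hit  (and semi
                      (assoc-ref %entities (substring str (+ amp 1) semi)))))
      (cond ((not amp)                  ; no more ampersands: emit the tail
             (string-concatenate-reverse (cons (substring str start) acc)))
            (hit                        ; known entity: emit its replacement
             (loop (+ semi 1) (cons* hit (substring str start amp) acc)))
            (else                       ; unknown: keep the literal "&"
             (loop (+ amp 1) (cons (substring str start (+ amp 1)) acc)))))))

(define (decode-attribute-values sxml)
  "Return a copy of SXML with every string attribute value decoded."
  (match sxml
    (('@ attrs ...)
     `(@ ,@(map (match-lambda
                  ((name (? string? value))
                   (list name (decode-entities value)))
                  (other other))
                attrs)))
    ((items ...) (map decode-attribute-values items))
    (atom atom)))

;; PARSE stands in for html->sxml; the default keeps today's behavior.
(define* (html->sxml* parse html #:key (decode-attributes #f))
  (let ((tree (parse html)))
    (if decode-attributes (decode-attribute-values tree) tree)))
```

With that, (decode-attribute-values '(*TOP* (a (@ (href "a&amp;b")))))
gives (*TOP* (a (@ (href "a&b")))), and callers who never pass
#:decode-attributes #t keep getting the raw value, so existing workarounds
don't end up decoding twice.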

Cheers
--
tomás