GNU bug report logs -
#52670
legacy base64 encoding of latin-1
Reported by: mattiase <at> acm.org
Date: Sun, 19 Dec 2021 21:48:01 UTC
Severity: normal
Done: Mattias Engdegård <mattiase <at> acm.org>
Bug is archived. No further changes may be made.
For what appears to be historical reasons, the base64 encoding functions (base64-encode-string etc.) treat characters in the range U+0080..U+00FF as if they were raw bytes in the 128..255 range. This means that
(base64-encode-string "ÿ")
and
(base64-encode-string "\xff")
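return the same result although the strings are completely different: the first is a multibyte string containing the character U+00FF, the second a unibyte string containing the raw byte 255. Attempts to encode other multibyte characters fail (correctly). For example,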
(base64-encode-string "Ÿ")
signals an error, as expected.
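To make the distinction concrete, here is a minimal sketch that contrasts the two strings and shows how to encode text explicitly before handing it to base64. It uses only standard functions (multibyte-string-p, encode-coding-string, base64-encode-string). The choice of utf-8 and latin-1 as coding systems is an assumption made for illustration, not something the report prescribes.

```elisp
;; The two strings differ in representation even though the legacy
;; behavior described above encodes them identically.
(multibyte-string-p "ÿ")      ; => t    (the character U+00FF)
(multibyte-string-p "\xff")   ; => nil  (the raw byte 255)

;; Unibyte input is what the base64 functions are meant for.
(base64-encode-string "\xff") ; => "/w=="

;; For text, pick a coding system explicitly first.  The result then
;; depends on that choice, which is the point of being explicit.
(base64-encode-string (encode-coding-string "ÿ" 'utf-8))   ; => "w78="
(base64-encode-string (encode-coding-string "ÿ" 'latin-1)) ; => "/w=="
```

Encoding first means the caller, not the base64 function, decides which bytes a character stands for.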
I propose we tighten up the behavior by eliminating the legacy handling of characters in the U+0080..U+00FF range. Leaving the bug in place enables incorrect, brittle and error-prone usage: the functions are clearly intended to be fed encoded text only and should signal an error otherwise, as stated in the manual.
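The proposed stricter behavior can be approximated in Lisp with a wrapper, which may help when auditing callers. This is only a sketch: the function name my-base64-encode-string-strict and its error message are hypothetical, and it is not the change made to Emacs itself.

```elisp
(require 'seq)

(defun my-base64-encode-string-strict (string &optional no-line-break)
  "Like `base64-encode-string', but reject Latin-1 characters in STRING.
Hypothetical helper for illustration only.  NO-LINE-BREAK is passed
through unchanged."
  ;; In a unibyte string, elements 128..255 are raw bytes and are fine.
  ;; In a multibyte string, elements in that range are the characters
  ;; U+0080..U+00FF, which the legacy code silently treats as bytes.
  (when (and (multibyte-string-p string)
             (seq-some (lambda (ch) (<= #x80 ch #xff)) string))
    (error "Unencoded Latin-1 character in base64 input"))
  (base64-encode-string string no-line-break))

;; Unibyte input passes through as before.
(my-base64-encode-string-strict "\xff")   ; => "/w=="

;; Multibyte Latin-1 input is rejected instead of silently encoded.
(condition-case err
    (my-base64-encode-string-strict "ÿ")
  (error (error-message-string err)))
;; => "Unencoded Latin-1 character in base64 input"
```

The check must test multibyte-string-p first, because the same numeric element values mean raw bytes in a unibyte string and characters in a multibyte one.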
This bug report was last modified 3 years and 154 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.