GNU bug report logs - #52670
legacy base64 encoding of latin-1

Package: emacs

Reported by: mattiase <at> acm.org

Date: Sun, 19 Dec 2021 21:48:01 UTC

Severity: normal

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#52670; Package emacs. (Sun, 19 Dec 2021 21:48:01 GMT)

Acknowledgement sent to mattiase <at> acm.org:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 19 Dec 2021 21:48:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: mattiase <at> acm.org
To: bug-gnu-emacs <at> gnu.org
Subject: legacy base64 encoding of latin-1
Date: Sun, 19 Dec 2021 22:47:15 +0100
For what appear to be historical reasons, the base64 encoding functions (base64-encode-string etc.) treat characters in the range U+0080..U+00FF as if they were raw bytes in the 128..255 range. This means that

  (base64-encode-string "ÿ")

and

  (base64-encode-string "\xff")

return the same result although the strings are completely different. Attempts to encode other multibyte characters fail (correctly). For example,

  (base64-encode-string "Ÿ")

signals an error, as expected.

I propose we tighten up the behavior by eliminating the legacy handling of characters in the U+0080..U+00FF range. Letting the bug stay in place enables incorrect, brittle and error-prone usage: the functions are clearly intended to be fed encoded text only and should signal an error when not, as stated in the manual.
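
To illustrate the distinction drawn above, here is a minimal sketch of how a caller can turn text into bytes explicitly before base64-encoding it, so the result does not depend on the legacy handling. The helper name my-base64-encode-text is hypothetical and not part of Emacs; the default of utf-8 is likewise an assumption made for the example.

```elisp
;; Hypothetical helper (not part of Emacs): encode TEXT to bytes with an
;; explicit coding system, then base64-encode the resulting unibyte string.
(defun my-base64-encode-text (text &optional coding-system)
  "Return the base64 encoding of TEXT encoded with CODING-SYSTEM.
CODING-SYSTEM defaults to `utf-8' (an assumption of this sketch)."
  (base64-encode-string
   (encode-coding-string text (or coding-system 'utf-8))
   t))                                  ; t = no line breaks in the output

;; "\u00ff" is the multibyte character ÿ (U+00FF).
(my-base64-encode-text "\u00ff")              ; => "w78="  (UTF-8 bytes C3 BF)
(my-base64-encode-text "\u00ff" 'iso-latin-1) ; => "/w=="  (single byte FF)

;; "\xff" is a unibyte string holding the raw byte FF.
(base64-encode-string "\xff")                 ; => "/w=="

;; A multibyte character outside Latin-1, such as Ÿ (U+0178), is rejected
;; when passed directly; the error can be caught like any other.
(condition-case err
    (base64-encode-string "\u0178")
  (error (message "rejected: %s" (error-message-string err))))
```

With the coding system spelled out, the same character yields different base64 output depending on the encoding chosen, which is the ambiguity that passing unencoded text hides.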





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52670; Package emacs. (Mon, 20 Dec 2021 17:27:02 GMT)

Message #8 received at 52670 <at> debbugs.gnu.org:

From: mattiase <at> acm.org
To: 52670 <at> debbugs.gnu.org
Subject: [PATCH] legacy base64 encoding of latin-1 
Date: Mon, 20 Dec 2021 18:26:15 +0100
It really looks like the erroneous behaviour was an unintended effect of commit 680d4b87f3d88a8b79f883cf3635036747588250. Anyway, here is a patch.

[0001-Fix-sloppy-base64-acceptance-of-some-multibyte-chara.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52670; Package emacs. (Mon, 20 Dec 2021 19:12:02 GMT)

Message #11 received at 52670 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: mattiase <at> acm.org
Cc: 52670 <at> debbugs.gnu.org
Subject: Re: bug#52670: [PATCH] legacy base64 encoding of latin-1
Date: Mon, 20 Dec 2021 21:10:56 +0200
> From: mattiase <at> acm.org
> Date: Mon, 20 Dec 2021 18:26:15 +0100
> 
> It really looks like the erroneous behaviour was an unintended effect of commit 680d4b87f3d88a8b79f883cf3635036747588250.

I think that patch was correct at the time it was done, but it wasn't
undone/fixed when we switched to Unicode.

> Anyway, here is a patch.

Thanks, but this should at the very least be announced as an
incompatible Lisp change in NEWS.




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Mon, 20 Dec 2021 19:25:02 GMT)

Notification sent to mattiase <at> acm.org:
bug acknowledged by developer. (Mon, 20 Dec 2021 19:25:03 GMT)

Message #16 received at 52670-done <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 52670-done <at> debbugs.gnu.org
Subject: Re: bug#52670: [PATCH] legacy base64 encoding of latin-1
Date: Mon, 20 Dec 2021 20:24:21 +0100
On 20 Dec 2021, at 20:10, Eli Zaretskii <eliz <at> gnu.org> wrote:

> Thanks, but this should at the very least be announced as an
> incompatible Lisp change in NEWS.

Right, I added a detailed notice. Thanks for taking a look!

Pushed; closing.





Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 18 Jan 2022 12:24:09 GMT)
