GNU bug report logs - #58168: string-lessp glitches and inconsistencies
Message #41 received at 58168 <at> debbugs.gnu.org (full text, mbox):
On 1 Oct 2022, at 12:02, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> Funnily enough, the latter displays in a different way for me, which may
> or may not be a bug:
>
> This is with `display-raw-bytes-as-hex' t.
You are right, that is completely broken -- display-raw-bytes-as-hex shouldn't affect the display of C1 controls.
Whether (string 128) displays "\200" or "\x80", however tarted up in a fancy face, it's still a lie. Only something like "\u0080" would actually be correct.
It seems to be a relic from the pre-Unicode days of Emacs: the code responsible muddles the display of raw bytes with that of Unicode C1 controls.
The attached patch untangles the two somewhat and lets display-raw-bytes-as-hex do what its name and documentation suggest, while using a non-confusing display for C1 controls.
The command
(insert "C1: " (string 128) " raw: " (unibyte-string 128) ".\n")
currently displays
C1: \200 raw: \200.
or
C1: \x80 raw: \x80.
depending on display-raw-bytes-as-hex. With the patch, we get
C1: \u0080 raw: \200.
or
C1: \u0080 raw: \x80.
which should satisfy everyone. What about it?
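To make the distinction between the two strings concrete, here is a small sketch that can be evaluated in *scratch* or ielm. It only inspects the strings themselves and does not depend on the patch; the values in the comments are what a current Emacs should return.

```elisp
;; c1:  U+0080 as a character, i.e. a multibyte string holding one C1 control.
;; raw: the byte 0x80 itself, i.e. a unibyte string holding one raw byte.
(let ((c1  (string 128))
      (raw (unibyte-string 128)))
  (list
   (multibyte-string-p c1)     ; t
   (multibyte-string-p raw)    ; nil
   (string-bytes c1)           ; 2, U+0080 takes two bytes internally
   (string-bytes raw)          ; 1
   ;; Converting the raw byte to multibyte does not yield U+0080 but a
   ;; distinct "raw byte" character above the Unicode range.
   (= (aref (string-to-multibyte raw) 0) #x3fff80)))   ; t
```

The form evaluates to (t nil 2 1 t), so the two strings are different objects all the way down even though both currently display as \200 (or \x80).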
[unicode-escape-display.diff (application/octet-stream, attachment)]
I see that redisplay-testsuite.el needs amending too; it actually looks buggy in this respect. If the above approach is deemed acceptable, I'll submit a patch that includes that file as well.
This bug report was last modified 2 years and 276 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.