GNU bug report logs - #50983
28.0.50; [REGRESSION, BUG] Display bugs with uncommon characters
Reported by: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Date: Sat, 2 Oct 2021 22:51:02 UTC
Severity: normal
Found in version 28.0.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #59 received at 50983 <at> debbugs.gnu.org:
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Mon, 4 Oct 2021 11:35:41 +0330
> Cc: Alan Third <alan <at> idiocy.org>, 50983 <at> debbugs.gnu.org
>
> But the problem does not happen with vim (nor with emacs 27 for `weird.txt`), so it is clearly an interaction
> of different elements.
>
> Anyhow, I have opened an [upstream issue](https://github.com/kovidgoyal/kitty/issues/4094). Please
> subscribe to it so that you might offer your emacs expertise there, if needed.
I subscribed and posted the following comment:
Emacs uses character width tables computed from the latest version of
the Unicode Standard, 14.0.0, using the data in the file
EastAsianWidth.txt. In that file, the U+00AD SOFT HYPHEN character,
which caused the problems in your file, has the East Asian Width
property value of A, which stands for "Ambiguous". The definition of
this value in Unicode Standard Annex #11 (UAX#11) is as follows:

  East Asian Ambiguous (A): All characters that can be sometimes wide
  and sometimes narrow. Ambiguous characters require additional
  information not contained in the character code to further resolve
  their width.

  Ambiguous characters occur in East Asian legacy character sets as
  wide characters, but as narrow (i.e., normal-width) characters in
  non-East Asian usage.

And since the file you showed didn't have any East Asian legacy
characters, treating SOFT HYPHEN as narrow is IMO correct.
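
For anyone who wants to check the property value themselves, here is a
minimal Python sketch. It is not how Emacs or any terminal computes
widths. It only reads the East Asian Width class from the Unicode
database bundled with Python, whose Unicode version may differ from
14.0.0. The display_width helper and its ambiguous_is_wide parameter
are hypothetical names invented for this illustration, and the helper
ignores combining and zero-width characters and any terminal-specific
rules.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"


def display_width(ch: str, ambiguous_is_wide: bool = False) -> int:
    """Hypothetical helper: estimate the columns used by one character.

    Only the East Asian Width class is consulted. Combining and
    zero-width characters are deliberately not handled here.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and Fullwidth characters take two columns.
        return 2
    if eaw == "A":
        # Ambiguous: the character code alone does not decide the
        # width, so the caller has to supply the context.
        return 2 if ambiguous_is_wide else 1
    # Narrow, Halfwidth and Neutral characters take one column.
    return 1


# Show the name and East Asian Width class of U+00AD.
print(unicodedata.name(SOFT_HYPHEN), unicodedata.east_asian_width(SOFT_HYPHEN))
# Non-East Asian context: treat Ambiguous as narrow.
print(display_width(SOFT_HYPHEN))
# East Asian legacy context: treat Ambiguous as wide.
print(display_width(SOFT_HYPHEN, ambiguous_is_wide=True))
```

The point of the sketch is that the answer for an Ambiguous character
depends on a setting outside the character itself, so two programs
that resolve it differently will disagree about cursor positions.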
> > changing the "character encoding" setting in iTerm to ASCII
>
> This is a most loathsome workaround. I do want UTF-8, as I use mathematical symbols, emojis, and non-English
> languages. Anyhow, making the text full of random unrecognized characters is not much better than the
> current behavior.
It is better because it doesn't confuse the user regarding which
character he or she is editing.
But I agree with you that the results are hardly satisfactory, so my
recommendation is not to use Kitty in conjunction with Emacs.
This bug report was last modified 2 years and 317 days ago.