GNU bug report logs - #50983
28.0.50; [REGRESSION, BUG] Display bugs with uncommon characters


Package: emacs;

Reported by: Rudi C <rudiwillalwaysloveyou <at> gmail.com>

Date: Sat, 2 Oct 2021 22:51:02 UTC

Severity: normal

Found in version 28.0.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Message #59 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon characters
Date: Mon, 04 Oct 2021 15:40:05 +0300
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Mon, 4 Oct 2021 11:35:41 +0330
> Cc: Alan Third <alan <at> idiocy.org>, 50983 <at> debbugs.gnu.org
> 
> But the problem does not happen with vim (nor with emacs 27 for `weird.txt`), so it is clearly an interaction
> of different elements. 
> 
> Anyhow, I have opened an [upstream issue](https://github.com/kovidgoyal/kitty/issues/4094). Please
> subscribe to it so that you might offer your emacs expertise there, if needed.

I subscribed and posted the following comment:

Emacs uses character width tables computed from the data in the file
EastAsianWidth.txt of the latest Unicode Standard, version 14.0.0.
In that file, the U+00AD SOFT HYPHEN character,
which caused the problems in your file, has the East Asian Width
property value of A, which stands for "Ambiguous".  The definition of
this value in the Unicode Standard Annex 11 (UAX#11) is as follows:

  East Asian Ambiguous (A): All characters that can be sometimes wide
  and sometimes narrow. Ambiguous characters require additional
  information not contained in the character code to further resolve
  their width.

    Ambiguous characters occur in East Asian legacy character sets as
    wide characters, but as narrow (i.e., normal-width) characters in
    non-East Asian usage.

And since the file you show didn't have any East Asian legacy
characters, treating SOFT HYPHEN as narrow is IMO correct.
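
To make the reasoning above concrete, here is a minimal sketch in
Python that looks up the East Asian Width property and resolves the
Ambiguous class by context.  It is only an illustration of the UAX#11
rule quoted above, not how Emacs computes its width tables; the helper
name `column_width` and its `east_asian_context` parameter are
hypothetical, and the result depends on the Unicode data shipped with
the Python interpreter.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # U+00AD SOFT HYPHEN


def column_width(ch: str, east_asian_context: bool = False) -> int:
    """Hypothetical helper: columns occupied by one character.

    Only the East Asian Width property is consulted; combining marks
    and control characters are deliberately ignored in this sketch.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and Fullwidth characters always take two columns.
        return 2
    if eaw == "A":
        # Ambiguous: wide only in an East Asian legacy context.
        return 2 if east_asian_context else 1
    # Narrow, Halfwidth and Neutral characters take one column.
    return 1


print(unicodedata.east_asian_width(SOFT_HYPHEN))           # A
print(column_width(SOFT_HYPHEN))                           # 1
print(column_width(SOFT_HYPHEN, east_asian_context=True))  # 2
```

If the application and the terminal resolve the Ambiguous class
differently, their column counts for the same line will not agree.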

> > changing the "character encoding" setting in iTerm to ASCII
> 
> This is a most loathsome workaround. I do want UTF-8, as I use mathematical symbols, emojis, and non-English
> languages. Anyhow, making the text full of random unrecognized characters is not much better than the
> current behavior.

It is better because it doesn't confuse the user regarding which
character he or she is editing.

But I agree with you that the results are hardly satisfactory, so my
recommendation is not to use Kitty in conjunction with Emacs.




This bug report was last modified 2 years and 317 days ago.


