GNU bug report logs - #70000
29.2; Grapheme handling incorrect
Reported by: Phillip Susi <phill <at> thesusis.net>
Date: Mon, 25 Mar 2024 18:47:01 UTC
Severity: normal
Tags: notabug, wontfix
Found in version 29.2
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
> From: jman <jman <at> city17.xyz>
> Cc: 70000 <at> debbugs.gnu.org, phill <at> thesusis.net
> Date: Sun, 29 Jun 2025 14:36:31 +0200
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > The protocol described in
> >
> > https://sw.kovidgoyal.net/kitty/text-sizing-protocol/
> >
> > if we decide to implement it in Emacs, will need some non-trivial
> > changes in how Emacs currently accounts for character width on
> > display. That is because this protocol does NOT allow to query the
> > terminal about the display width of a string of characters. Instead,
> > it allows a program running on the terminal to instruct the terminal
> > about the display width it expects to get, and the terminal needs to
> > obey. What this means for Emacs is that we will have to add code
> > which will determine the expected display width of each composed
> > sequence of characters. By contrast, what we have now is that we
> > expect the display backend to tell us the display width.
> >
> > This is important because Emacs has code which performs layout
> > calculations by using the display code without actually displaying
> > anything. Cursor movement commands in Emacs, and many places within
> > the display engine, use these capabilities. When this code runs, it
> > needs some way of computing the display width of each glyph that will
> > or would be displayed. If we need to compute that ourselves, we will
> > need to add such a code, which currently doesn't exist.
> >
> > Beyond that, there's the issue of how widely will this protocol be
> > supported by terminal emulators other than kitty, and what should
> > Emacs do when it runs on a terminal which doesn't support this.
>
> Thank you Eli for the overview.
>
> I infer we're still at a point with no solution at the horizon (and unfortunately I cannot
> contribute one).

Even if implementing this protocol were the complete solution, someone
would need to code it for Emacs.

> Meanwhile, is there a suggested workaround for users of Emacs TTY? The issue is that multi-codepoint
> grapheme clusters are not correctly rendered. It was suggested that I experiment with
> `glyphless-char-display`, but (IIUC) it only works with single-codepoint graphemes. For example, Emacs
> `describe-char` reports that the "writing hand" emoji is hex U+270D, but Emojipedia[0] describes
> it as (U+270D, U+FE0F), U+FE0F being the VS16 variation selector[1], so I am not sure I can just hide
> or replace it with something else.

If auto-composition mode is turned ON (it is by default), Emacs
expects the terminal to combine modifier characters (such as
U+FE0F) with the preceding base character, producing a single glyph.
The width of that glyph is expected to be the width of the base
character as stored in char-width-table. As long as the terminal
behaves as Emacs expects, you should be okay. So the suggested
workaround is to find a terminal emulator which behaves as described
above, or can be forced to behave that way. The sequence U+270D
followed by U+FE0F should thus work in most cases.

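To make the expectation above concrete, here is a rough Python sketch
(not Emacs code) of the rule that a composed sequence takes the width
of its base character. The helper names char_cells and composed_width
are hypothetical, and Python's unicodedata tables are only an assumed
stand-in for char-width-table; the two may disagree for some
characters.

```python
import unicodedata


def char_cells(ch):
    """Approximate terminal cell width of a single character.

    Assumption: unicodedata stands in for Emacs's char-width-table.
    """
    # Combining marks, variation selectors and format characters
    # are treated as occupying no cells of their own.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide/Fullwidth characters take two cells, others one.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def composed_width(cluster):
    """Width expected for a composed sequence: that of its base character."""
    return char_cells(cluster[0])


writing_hand = "\u270d\ufe0f"  # U+270D followed by U+FE0F (VS16)
print(composed_width(writing_hand), char_cells("\u270d"))
```

If a terminal gives the sequence a different number of cells than the
base character alone, its idea of the cursor column and the one
computed this way would presumably diverge.
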
If you are talking about Emoji sequences that include characters which
are not modifiers (i.e., they are characters in their own right and
have non-zero width in char-width-table), things will generally not
work in Emacs, I'm afraid, not without some auxiliary protocol which
will allow Emacs to know the display width of an arbitrary sequence of
characters.

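A rough Python illustration of why such sequences are hard (not Emacs
code; unicodedata is an assumed stand-in for char-width-table, and the
helper names are hypothetical): summing per-character widths for an
Emoji sequence joined by U+200D ZERO WIDTH JOINER gives a figure that a
terminal rendering the whole sequence as one emoji would typically not
use.

```python
import unicodedata


def char_cells(ch):
    """Approximate cell width of one character (stand-in for char-width-table)."""
    # Combining marks and format characters such as U+200D take no cells.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def summed_width(text):
    """Naive width: add up the widths of the individual characters."""
    return sum(char_cells(ch) for ch in text)


# MAN, ZWJ, WOMAN, ZWJ, GIRL: three characters with non-zero width
# of their own, joined into one Emoji sequence.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"

# With Unicode 9 or later tables, each emoji counts as 2 cells and each
# ZWJ as 0, so the naive sum is 6.
print(summed_width(family))
```

Whether the terminal draws this as one glyph or as three separate
emoji cannot be determined from the character data alone, which is the
gap an auxiliary protocol would have to fill.
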
This bug report was last modified 12 days ago.