GNU bug report logs -
#70000
29.2; Grapheme handling incorrect
Reported by: Phillip Susi <phill <at> thesusis.net>
Date: Mon, 25 Mar 2024 18:47:01 UTC
Severity: normal
Tags: notabug, wontfix
Found in version 29.2
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #27 received at control <at> debbugs.gnu.org (full text, mbox):
tags 70000 wontfix
close 70000
thanks
> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Fri, 28 Feb 2025 19:04:51 -0800
> Cc: Phillip Susi <phill <at> thesusis.net>, 70000 <at> debbugs.gnu.org
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> >> From: Phillip Susi <phill <at> thesusis.net>
> >> Cc: 70000 <at> debbugs.gnu.org
> >> Date: Wed, 27 Mar 2024 10:11:30 -0400
> >>
> >> Eli Zaretskii <eliz <at> gnu.org> writes:
> >>
> >> > Querying the cursor position won't help in this case because it is
> >> > Emacs that moves the cursor when you type C-f, not the terminal.
> >>
> >> I'm not talking about C-f, but simply displaying the characters on the
> >> screen. Emacs assumes the width is 4 when it prints this character, and
> >> so it thinks that the cursor moved over 4 places. When the terminal
> > >> actually only moves the cursor over 2 spaces, Emacs gets out of sync
> >> with the terminal, and massive breakage occurs.
> >
> > I understand what you are saying, but this is not how Emacs display
> > code works. It needs to know the width of every character displayed
> > on the screen, and it needs to be able to determine that even without
> > actually displaying the character.
> >
> > When Emacs is about to redraw some portion of the screen, it moves the
> > cursor to that place. To be able to move the cursor there, it needs
> > to be able to compute the coordinates on the screen of every character
> > that is currently shown, so it can construct the command for the
> > > terminal driver to move the cursor to that place. If Emacs were to rely
> > > on displaying characters for that, it would need to constantly
> > > redraw large portions of the screen, and that would both be much
> > slower and cause unpleasant flickering of the display, due to
> > redrawing of screen portions that don't actually change.
> >
> > So this technique is out of the question for Emacs.
> >
> >> By reading back the cursor position from the terminal after displaying a
> >> grapheme cluster, it would learn how the terminal displayed it and
> >> update its idea of where the cursor is correctly.
> >
> > I understand. But Emacs needs this information also long after the
> > characters were already drawn. For example, imagine that Emacs
> > displays these characters on the screen, and then leaves most of the
> > screen intact and periodically redraws some small portion of the
> > > screen, like updating the current time in the lower-right corner of the
> > screen when Emacs is otherwise idle. To do that, Emacs needs to move
> > the cursor from its current position somewhere on the screen to the
> > lower-right corner, redraw the time there, then move the cursor back
> > to where it was. These cursor moves are based on the ability to
> > calculate the geometry of each character on display without actually
> > writing the characters to the screen.
> >
> > In addition, if Emacs had to query the cursor position after each
> > written character, its redisplay would be much slower than it is now.
> >
> >> I originally ran into this problem not with a ZWJ, but with an emoji
> > >> followed by variation selector 16 (U+FE0F) that someone used in a subject line of
> >> an email, and when browsing my inbox with notmuch, the terminal went
> >> FUBAR.
> >
> > Yes, that's a known issue with some of the terminal emulators that
> > compose Emoji and other similar character sequences into grapheme
> > clusters, while ignoring the width that is expected from the result.
> > I'm not aware of any good solution, unfortunately. Sometimes,
> > disabling auto-composition-mode helps, but even that cannot solve all
> > the problems, especially when each of the characters composed by the
> > terminal into a single grapheme cluster has non-zero width according
> > to the Unicode tables. (If only the first character in the composed
> > sequence has non-zero width and the rest are zero-width, disabling
> > auto-composition-mode might produce a correct display.)
> >
> > The bottom line is what I said at the beginning: we need some protocol
> > by which a terminal emulator could be queried about whether it
> > supports character composition, and if so, what is the screen width of
> > a given sequence of codepoints that will be composed, without actually
> > displaying them. Better yet, some standard table of such widths could
> > be accepted by complying terminal emulators, and then Emacs could use
> > such a table to know the width in advance (similarly to how it knows
> > that from the Unicode data files).
> >
> > Until such protocols or tables exist, Emacs will be unable to produce
> > correct display on these terminal emulators.
>
> It seems to me like this should be closed as a wontfix?
Yes, now done.
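
For readers following the thread, the mismatch discussed above can be
illustrated with a small sketch. This is not Emacs's display code; it only
assumes a per-codepoint width rule based on Unicode data (wide characters
take 2 columns, combining marks and format characters such as ZWJ take 0)
and sums those widths without composing anything. The function names
codepoint_width and summed_width are hypothetical, chosen for this example.

```python
import unicodedata


def codepoint_width(ch: str) -> int:
    """Approximate column width of one code point, ignoring composition."""
    # Combining marks (Mn, Me) and format characters such as ZWJ (Cf)
    # are assumed to occupy no column of their own.
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def summed_width(text: str) -> int:
    """Width an application expects if it adds up per-codepoint widths."""
    return sum(codepoint_width(ch) for ch in text)


# MAN, ZERO WIDTH JOINER, WOMAN: a ZWJ sequence of two wide emoji.
family = "\U0001F468\u200D\U0001F469"
print(summed_width(family))
```

With Unicode data that marks both emoji as Wide, the sum is 4, matching the
width the reporter says Emacs assumes. A terminal that composes the sequence
into one grapheme cluster may advance the cursor by only 2 columns, and the
application has no way to learn that without querying or drawing.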