GNU bug report logs - #27544
25.1; Visualization of Unicode bidirectional marks

Package: emacs

Reported by: Itai Berli <itai.berli <at> gmail.com>

Date: Sat, 1 Jul 2017 10:00:02 UTC

Severity: wishlist

Found in version 25.1

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #8 received at 27544 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Itai Berli <itai.berli <at> gmail.com>
Cc: 27544 <at> debbugs.gnu.org
Subject: Re: bug#27544: 25.1; Visualization of Unicode bidirectional marks
Date: Sat, 01 Jul 2017 13:36:24 +0300

> From: Itai Berli <itai.berli <at> gmail.com>
> Date: Sat, 1 Jul 2017 12:58:28 +0300
> 
> Emacs supports 12 Unicode bidirectional marks (ALM, RLM, LRM, LRE,
> RLE, LRO, RLO, PDF, FSI, LRI, RLI, and PDI), each of which displays as
> a very thin space. This raises two problems.
> 
> 1. On the one hand, the fact that these inherently invisible
> marks manifest, by default, as thin spaces undermines attempts at
> precise alignment and positioning. Moreover, in the case of LRM, RLM
> and ALM, this behavior contradicts explicit directions given in the
> Unicode Bidirectional Algorithm 8.0.0 specification (section 2.6,
> Implicit Directional Marks):
> > they do not appear in the display
> (To my understanding, this is meant to apply to all bidi marks, even
> if only stated explicitly for LRM, RLM and ALM.)
> 
> 2. On the other hand, the fact that these spaces are so thin as to be
> barely noticeable, and that they are indistinguishable from one
> another, makes it difficult to debug and resolve strange and/or
> erroneous behavior that can happen in a bidi document, an example of
> which is given below.

The above is the default way these control characters are displayed.
This default was chosen to avoid, on the one hand, making them
entirely invisible, which was deemed un-Emacsy, and on the other hand
to keep them barely visible, so that they won't disrupt the legibility
of the displayed text.

However, Emacs being Emacs, this is just the default, and it can be
changed.  The visual appearance of these (and other similar)
characters can be customized via the variable
'glyphless-char-display-control', which is described in the Emacs
manual, and in more detail in the ELisp manual.
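
As a minimal sketch of such a customization (an illustration, not
text from the manuals), the init-file fragment below asks Emacs to
show all characters in the 'format-control' group, which includes the
bidi marks, as boxed acronyms instead of thin spaces.  It copies the
existing value so that other groups keep their settings, and it goes
through 'customize-set-variable' because a plain 'setq' is not
expected to run the option's setter and so would not update the
display.

```elisp
;; Sketch: display format-control characters (LRM, RLM, RLO, ...)
;; as acronyms rather than thin spaces.
(let ((control (copy-alist glyphless-char-display-control)))
  ;; Change only the `format-control' entry; leave the rest as is.
  (setf (alist-get 'format-control control) 'acronym)
  ;; Run the option's setter so the new method takes effect.
  (customize-set-variable 'glyphless-char-display-control control))
```

Replacing 'acronym' with 'zero-width' should instead make these
characters occupy no space at all, and 'hex-code' shows their code
points.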

> The solution to both problems is to make the bidi marks visible in
> `whitespace` mode only, and to give them glyphs that are (a) easy to
> notice, (b) distinguishable from other whitespace visualization glyphs, (c)
> distinct from one another.

You can do that using 'glyphless-char-display-control'.  If that is
somehow not enough, you could also define a display-table entry for
these characters, specifically for whitespace-mode.  Patches to that
effect are welcome (I think this should be a user option, if we want
such a feature).
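
A rough sketch of the display-table idea, under stated assumptions:
the names 'my-bidi-mark-labels' and 'my-show-bidi-marks' are
hypothetical, the bracketed labels are arbitrary, and the interaction
with the display table that whitespace-mode itself installs has not
been checked here.  The command gives each of the twelve bidi
controls its own visible label in the current buffer only.

```elisp
(defvar my-bidi-mark-labels
  '((#x200E . "[LRM]") (#x200F . "[RLM]") (#x061C . "[ALM]")
    (#x202A . "[LRE]") (#x202B . "[RLE]") (#x202C . "[PDF]")
    (#x202D . "[LRO]") (#x202E . "[RLO]")
    (#x2066 . "[LRI]") (#x2067 . "[RLI]") (#x2068 . "[FSI]")
    (#x2069 . "[PDI]"))
  "Hypothetical alist mapping bidi control characters to labels.")

(defun my-show-bidi-marks ()
  "Show bidi control characters as labels in the current buffer.
Illustrative only: installs entries in a buffer-local display table."
  (interactive)
  (unless buffer-display-table
    (setq buffer-display-table (make-display-table)))
  (dolist (entry my-bidi-mark-labels)
    ;; A display-table entry is a vector of glyphs; plain character
    ;; codes are valid glyphs, so a string converted to a vector works.
    (aset buffer-display-table (car entry) (vconcat (cdr entry)))))
```

Because the table is buffer-local, other buffers keep the default
thin-space display.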

> If we were able to visualize the whitespace, we would have realized from
> the beginning that the sequence of characters in this paragraph was, from
> left to right:
> 
> RTL-RTL-RLO-RLO-H-e-l-l-o-PDF-,-PDF-SPACE-w-o-r-l-d-!
> 
> Thus, our first three actions removed the first three characters, leaving us
> with:
> 
> RLO-H-e-l-l-o-PDF-,-PDF-SPACE-w-o-r-l-d-!
> 
> We now realize that even the final, correct form, is in fact littered
> with bidi errors and potential landmines!

Overriding the bidi attributes with the likes of RLO can indeed lead
to confusing display.  Emacs has functions that Lisp applications can
use to discover these confusing situations, where the application
would like to warn users.  See the description of
bidi-find-overridden-directionality in the ELisp manual.
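
For illustration, a small command built on that function might look
like the sketch below.  The command name 'my-next-bidi-override' is
hypothetical; the only Emacs-provided piece is
'bidi-find-overridden-directionality', called here with a nil OBJECT
argument to mean the current buffer.

```elisp
(defun my-next-bidi-override ()
  "Move point to the next character with overridden directionality.
Illustrative sketch; reports in the echo area if none is found."
  (interactive)
  (let ((pos (bidi-find-overridden-directionality
              ;; Start just after point so that repeated calls advance.
              (min (1+ (point)) (point-max))
              (point-max)
              nil)))
    (if pos
        (goto-char pos)
      (message "No overridden directionality after point"))))
```

Invoking it repeatedly with 'M-x my-next-bidi-override' should step
through the places where an override such as RLO changes how strong
directional characters are displayed.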




This bug report was last modified 3 years and 180 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.