GNU bug report logs - #28339
25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Reported by: Nima Aryan <nimawebgard <at> gmail.com>
Date: Sun, 3 Sep 2017 16:41:01 UTC
Severity: normal
Found in version 25.2
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #71 received at 28339 <at> debbugs.gnu.org:
> From: handa <handa <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Sat, 16 Sep 2017 10:32:57 +0900
>
> In article <83y3phmca8.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > > Each Arabic character constitutes a grapheme cluster. Then, for the
> > > sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighboring
> > > cluster should 200C belong? Does Unicode define it?
>
> > I don't think Unicode defines that, but I thought the shaping engine
> > gives us back glyphs that don't include ZWNJ itself. Evidently,
> > that's not true, which I find strange.
>
> If ZWNJ is WITHIN a grapheme cluster (i.e. not at the edges
> of the cluster), the m17n lib does not return a ZWNJ glyph.
>
> > > Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?
>
> > It's convenient. But we already support deletion of composed
> > characters, so I didn't think it mattered.
>
> If Unicode does not have a rule for ZWNJ handling, how would a user
> know which key to type to delete ZWNJ: C-d or BS?

Above, you asked whether Unicode defines which grapheme cluster ZWNJ
should belong to. On that, I said I didn't think there's any Unicode
ruling (although to be sure, we should probably ask on the Unicode
mailing list).

But here you are talking about deleting a ZWNJ from display, and
there Unicode does have a clear rule; see Section 23.2 of the
Standard. A pertinent quote (Implementation Notes, p.849):

  As with all other alternate format characters, fonts should use an
  invisible zero-width glyph for representation of both ZWJ and ZWNJ.

This seems to be a requirement for fonts, but it does convey what
Unicode thinks about displaying ZWNJ.

Emacs generally tries to display such control characters, because
hiding them from users is un-Emacsy. But in this case, it seems like
users expect us to hide it.
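
For reference, the code point sequence quoted earlier in this message can be inspected outside Emacs. The following is only a minimal Python sketch using the standard `unicodedata` module. The helper names `decode` and `format_char_positions` are hypothetical and are not part of Emacs or the m17n library. It shows that U+200C is a format character (general category Cf), the class of characters the Unicode quote above says fonts should render with an invisible zero-width glyph. It says nothing about which grapheme cluster a shaping engine assigns ZWNJ to.

```python
import unicodedata

# Code point sequence quoted in the thread (hex, space separated).
CODEPOINTS = "0646 0645 06CC 200C 0634 0648 062F"


def decode(hex_seq):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(h, 16)) for h in hex_seq.split())


def format_char_positions(text):
    """Return the indices of characters whose general category is Cf."""
    return [i for i, ch in enumerate(text) if unicodedata.category(ch) == "Cf"]


if __name__ == "__main__":
    word = decode(CODEPOINTS)
    # List each code point with its Unicode name and general category.
    for i, ch in enumerate(word):
        print(i, f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
    print("format characters at:", format_char_positions(word))
```

In this sequence the only Cf character is the ZWNJ at index 3, between U+06CC and U+0634.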