GNU bug report logs - #28339
25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space


Package: emacs;

Reported by: Nima Aryan <nimawebgard <at> gmail.com>

Date: Sun, 3 Sep 2017 16:41:01 UTC

Severity: normal

Found in version 25.2

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 10:24:06 +0300
> From: handa <handa <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Sat, 16 Sep 2017 10:32:57 +0900
> 
> In article <83y3phmca8.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > > Each Arabic character constitutes a grapheme cluster.  Then, for the
> > > sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighboring
> > > cluster should 200C belong?  Does Unicode define it?
> 
> > I don't think Unicode defines that, but I thought the shaping engine
> > gives us back glyphs that don't include ZWNJ itself.  Evidently,
> > that's not true, which I find strange.
> 
> If ZWNJ is WITHIN a grapheme cluster (i.e. not at the edges
> of the cluster), the m17n lib does not return a ZWNJ glyph.
> 
> > > Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?
> 
> > It's convenient.  But we already support deletion of composed
> > characters, so I didn't think it mattered.
> 
> If Unicode does not have a rule for ZWNJ handling, how does a user
> know which key to type to delete ZWNJ: C-d or BS?

Above, you asked whether Unicode defines which grapheme cluster ZWNJ
should belong to.  On that, I said I didn't think there's any Unicode
ruling (although to be sure, we should probably ask a question on the
Unicode mailing list).
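
To make the sequence under discussion concrete, here is a minimal Python
sketch (not part of Emacs or the m17n library) that decodes the quoted
code points and lists each character's Unicode general category and
name.  The helper names `decode` and `describe` are hypothetical and
exist only for this illustration.  It shows that U+200C is a format
character (category Cf) sitting between two letters; it says nothing
about which grapheme cluster a shaping engine assigns it to.

```python
import unicodedata

# Code points quoted in the message above (hex, space-separated).
SEQUENCE = "0646 0645 06CC 200C 0634 0648 062F"


def decode(hex_points):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(h, 16)) for h in hex_points.split())


def describe(text):
    """Return (code point, general category, name) for each character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.category(c), unicodedata.name(c))
        for c in text
    ]


text = decode(SEQUENCE)
rows = describe(text)

if __name__ == "__main__":
    for code_point, category, name in rows:
        print(code_point, category, name)
```

The fourth entry is the only one in category Cf; the other six are
letters (category Lo).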

But here, you are talking about deleting a ZWNJ from display, and
there Unicode does have a clear rule; see Section 23.2 of the Unicode
Standard.  A pertinent quote (Implementation Notes, p.849):

  As with all other alternate format characters, fonts should use an
  invisible zero-width glyph for representation of both ZWJ and ZWNJ.

This seems to be a requirement for fonts, but it does convey what
Unicode thinks about displaying ZWNJ.

Emacs generally tries to display such control characters, because
hiding them from users is un-Emacsy.  But in this case, it seems like
users expect us to hide it.




This bug report was last modified 4 years and 260 days ago.


