GNU bug report logs - #11860
24.1; Arabic - Harakat (diacritics, short vowels) don't appear


Package: emacs

Reported by: Steffan <smias@yandex.ru>

Date: Wed, 4 Jul 2012 18:43:12 UTC

Severity: normal

Found in version 24.1

Done: Stefan Kangas <stefan@marxist.se>

Message #62 received at 11860@debbugs.gnu.org:

From: Eli Zaretskii <eliz@gnu.org>
To: Kenichi Handa <handa@gnu.org>, Jason Rumney <jasonr@gnu.org>
Cc: 11860@debbugs.gnu.org, smias@yandex.ru
Subject: Re: bug#11860: 24.1;
	Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sun, 19 Aug 2012 21:22:40 +0300

> From: Kenichi Handa <handa@gnu.org>
> Cc: eliz@gnu.org, 11860@debbugs.gnu.org, smias@yandex.ru
> Date: Sat, 18 Aug 2012 11:45:27 +0900
> 
> So apparently, Emacs on Windows and on GNU/Linux uses
> different glyph metrics.  As the shaper on GNU/Linux (the
> m17n-lib library) handles the same font correctly, and other
> applications on Windows have no problem either, I suspect
> that the problem is in Emacs's interface with Uniscribe
> (w32font.c or w32uniscribe.c).

I agree.

> If this problem happens only for bidi scripts

Can you suggest how to test this hypothesis?

> one possibility is that Emacs's rendering engine (xdisp.c) expects
> the glyphs in a glyph-string to be rendered in that order, from
> left to right, but the glyph-string returned on Windows should be
> rendered in reverse order.

You may be right, but it's hard to be sure.  At least the advances[]
array returned by ScriptPlace seems to point in that direction.
Here's what I see in the debugger:

  Breakpoint 8, uniscribe_shape (lgstring=55041941) at w32uniscribe.c:373
  373                       LGLYPH_SET_CHAR (lglyph, chars[items[i].iCharPos
  (gdb) p items@nitems
  $1 = {0x35195a0}
  (gdb) p items[0]@nitems
  $2 = {{
      iCharPos = 0,
      a = {
	eScript = 26,
	fRTL = 1,
	fLayoutRTL = 1,
	fLinkBefore = 0,
	fLinkAfter = 0,
	fLogicalOrder = 1,
	fNoGlyphIndex = 0,
	s = {
	  uBidiLevel = 1,
	  fOverrideDirection = 0,
	  fInhibitSymSwap = 0,
	  fCharShape = 0,
	  fDigitSubstitute = 0,
	  fInhibitLigate = 0,
	  fDisplayZWG = 0,
	  fArabicNumContext = 0,
	  fGcpClusters = 0,
	  fReserved = 0,
	  fEngineReserved = 0
	}
      }
    }}
  (gdb) p nitems
  $3 = 1
  (gdb) p nglyphs
  $4 = 2
  (gdb) p advances[0]@nglyphs
  $5 = {8, 0}
  (gdb) p offsets[0]@nglyphs
  $6 = {{
      du = 0,
      dv = 0
    }, {
      du = 1,
      dv = -2
    }}
  (gdb) p chars[0]@2
  $7 = L"\x639\x652"

(Note that the fRTL member of items[0].a is set to TRUE.)  My
understanding of the advances[] array is that it gives, for each glyph
in the cluster, the number of pixels to advance to the right after
drawing that glyph.  So the fact that it is 8 for the first (base)
character and zero for the second one tells me that this grapheme
cluster is supposed to be rendered in reverse order: first the Sukun
(U+0652), then the Ayin (U+0639) at the same location, and then an
advance of 8 pixels to the next character.  Is this correct?
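
To make sure we are talking about the same model, here is a minimal
sketch of the rendering loop I have in mind (draw_glyph and
baseline_y are placeholders for illustration, not Emacs code):

  /* Hypothetical pen-advance model: draw each glyph at the current
     pen position plus its GOFFSET, then move the pen to the right
     by its advance.  */
  int pen_x = 0;
  for (int i = 0; i < nglyphs; i++)
    {
      draw_glyph (glyphs[i],
                  pen_x + offsets[i].du,
                  baseline_y + offsets[i].dv);
      pen_x += advances[i];
    }

Iterating forward with advances = {8, 0} would put the Sukun 8 pixels
past the Ayin; iterating the arrays in reverse puts both glyphs at
the same pen position and only then advances by 8, which is what the
numbers above suggest.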

If that is correct, then why do the glyphs produced on GNU/Linux also
have a non-zero xadvance:

  [0 1 1593 969 8 2 8 4 4 nil]
  [0 1 1618 760 0 -6 -3 8 -11 [-9 2 0]]
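
(For reference, my reading of the lglyph layout here, going by the
LGLYPH_IX_* indices in font.h, is [FROM TO CHAR CODE WIDTH LBEARING
RBEARING ASCENT DESCENT ADJUSTMENT]; so 1593/#x639 is the Ayin and
1618/#x652 the Sukun, 969 and 760 are the glyph codes, and 8 and 0
are the widths in question.)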

The value 8 after 969 comes directly from xadvance, as this code in
ftfont.c shows:

      LGLYPH_SET_WIDTH (lglyph, g->xadv >> 6);
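
(The >> 6 there is the conversion from FreeType's 26.6 fixed-point
units to whole pixels.)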

Is the meaning of xadvance in libotf different from its meaning in
Uniscribe?  (And why is the glyph-string element called WIDTH instead
of ADVANCE?)  If not, what am I missing?

> For instance, in the above case, we may have to render glyphs in
> this order (diacritical mark first):
> 
>   [0 1 1593 760 0 3 6 12 4 [1 -2 0]]
>   [0 1 1593 969 8 1 8 12 4 nil]

I tried the naive patch below, but it didn't quite work.  It seems
that those changes somehow prevented character composition.  Perhaps
Handa-san could give me some guidance here.

> I think further debugging must be done by those who
> know Uniscribe, w32font.c, and w32uniscribe.c.

It's very hard, given that the glyph-string documentation leaves a
lot to be desired, and that the way its various components are used
during drawing is likewise not clearly documented.  E.g., this:

    FROM-IDX and TO-IDX are used internally and should not be touched.

is not really helpful for explaining what FROM-IDX and TO-IDX are, so
how can I figure out whether the code you asked about is doing TRT?
And without knowing what each component of a glyph-string is used for
during drawing, how can I compare the values produced by the
Uniscribe APIs with what the glyph-string needs?  If someone could
explain all these things, it would make debugging possible.
Otherwise, I'm just randomly poking around...

Here's the patch I tried:

--- src/w32uniscribe.c~	2012-07-08 07:24:56.000000000 +0300
+++ src/w32uniscribe.c	2012-08-19 15:55:17.323623900 +0300
@@ -331,17 +331,13 @@ uniscribe_shape (Lisp_Object lgstring)
 		  Lisp_Object lglyph = LGSTRING_GLYPH (lgstring, lglyph_index);
 		  ABC char_metric;
 		  unsigned gl;
+		  int j1;
 
 		  if (NILP (lglyph))
 		    {
 		      lglyph = Fmake_vector (make_number (LGLYPH_SIZE), Qnil);
 		      LGSTRING_SET_GLYPH (lgstring, lglyph_index, lglyph);
 		    }
-		  /* Copy to a 32-bit data type to shut up the
-		     compiler warning in LGLYPH_SET_CODE about
-		     comparison being always false.  */
-		  gl = glyphs[j];
-		  LGLYPH_SET_CODE (lglyph, gl);
 
 		  /* Detect clusters, for linking codes back to
 		     characters.  */
@@ -365,6 +361,16 @@ uniscribe_shape (Lisp_Object lgstring)
 			    }
 			}
 		    }
+		  if (items[i].a.fRTL)
+		    j1 = to - (j - from);
+		  else
+		    j1 = j;
+
+		  /* Copy to a 32-bit data type to shut up the
+		     compiler warning in LGLYPH_SET_CODE about
+		     comparison being always false.  */
+		  gl = glyphs[j1];
+		  LGLYPH_SET_CODE (lglyph, gl);
 
 		  LGLYPH_SET_CHAR (lglyph, chars[items[i].iCharPos
 						 + from]);
@@ -372,13 +378,13 @@ uniscribe_shape (Lisp_Object lgstring)
 		  LGLYPH_SET_TO (lglyph, items[i].iCharPos + to);
 
 		  /* Metrics.  */
-		  LGLYPH_SET_WIDTH (lglyph, advances[j]);
+		  LGLYPH_SET_WIDTH (lglyph, advances[j1]);
 		  LGLYPH_SET_ASCENT (lglyph, font->ascent);
 		  LGLYPH_SET_DESCENT (lglyph, font->descent);
 
 		  result = ScriptGetGlyphABCWidth (context,
 						   &(uniscribe_font->cache),
-						   glyphs[j], &char_metric);
+						   glyphs[j1], &char_metric);
 		  if (result == E_PENDING && !context)
 		    {
 		      /* Cache incomplete... */
@@ -387,7 +393,7 @@ uniscribe_shape (Lisp_Object lgstring)
 		      old_font = SelectObject (context, FONT_HANDLE (font));
 		      result = ScriptGetGlyphABCWidth (context,
 						       &(uniscribe_font->cache),
-						       glyphs[j], &char_metric);
+						       glyphs[j1], &char_metric);
 		    }
 
 		  if (SUCCEEDED (result))
@@ -399,17 +405,17 @@ uniscribe_shape (Lisp_Object lgstring)
 		  else
 		    {
 		      LGLYPH_SET_LBEARING (lglyph, 0);
-		      LGLYPH_SET_RBEARING (lglyph, advances[j]);
+		      LGLYPH_SET_RBEARING (lglyph, advances[j1]);
 		    }
 
-		  if (offsets[j].du || offsets[j].dv)
+		  if (offsets[j1].du || offsets[j1].dv)
 		    {
 		      Lisp_Object vec;
 		      vec = Fmake_vector (make_number (3), Qnil);
-		      ASET (vec, 0, make_number (offsets[j].du));
-		      ASET (vec, 1, make_number (offsets[j].dv));
+		      ASET (vec, 0, make_number (offsets[j1].du));
+		      ASET (vec, 1, make_number (offsets[j1].dv));
 		      /* Based on what ftfont.c does... */
-		      ASET (vec, 2, make_number (advances[j]));
+		      ASET (vec, 2, make_number (advances[j1]));
 		      LGLYPH_SET_ADJUSTMENT (lglyph, vec);
 		    }
 		  else




