#50951 - 28.0.50; Urdu text is not displayed correctly

GNU bug report logs - #50951
28.0.50; Urdu text is not displayed correctly

Package: emacs;

Reported by: Rah Guzar <aikrahguzar <at> gmail.com>

Date: Fri, 1 Oct 2021 20:19:01 UTC

Severity: normal

Tags: moreinfo

Found in version 28.0.50

Done: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>

Bug is archived. No further changes may be made.

Message #26 received at 50951 <at> debbugs.gnu.org:

From: Rah Guzar <aikrahguzar <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50951 <at> debbugs.gnu.org
Subject: Re: bug#50951: Fwd: bug#50951: 28.0.50; Urdu text is not displayed correctly
Date: Sat, 2 Oct 2021 16:19:01 +0200

[Message part 1 (text/plain, inline)]

On Sat, Oct 2, 2021 at 3:09 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> The way to investigate such problems is to see what does hb-view, a
> program that is part of the HarfBuzz installation, produce for the
> same text with the same font.  If hb-view produces correct display,
> but Emacs doesn't, then the problem is indeed in Emacs; otherwise the
> problem is probably with the font, and in any case should be taken up
> with the HarfBuzz developers.

I tried hb-view with NotoNastaliqUrdu and the text:

خوبی اپنی قسمت کی

This is what I get:
[image: urduhbtestnoto.png]

In Emacs, however, I discovered that how the text is displayed depends a lot on the font size. For the same text at size 16, I get:
[image: emacsq16.png]

At size 24 it looks almost correct:
[image: emacsq24.png]

At size 32 it is really bad again:
[image: emacsq32.png]

The issue seems to be glyph placement rather than shaping. NotoNastaliqUrdu seems to be the only font with this issue. I am not sure if the problem is due to Nastaliq. The other two Nastaliq fonts seem to handle joining characters through composition.
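The hb-view comparison at sizes 16, 24 and 32 could be scripted so that each Emacs screenshot has a reference rendering at the same size. What follows is only a sketch: the font file path and the output file names are assumptions, and the helper `hb_view_command` is a hypothetical name. It relies only on the hb-view options `--font-file`, `--font-size`, `--text` and `--output-file`.

```python
import shlex

# Text from the report.
TEXT = "خوبی اپنی قسمت کی"

# Assumption: path to the font file; adjust to where it lives on your system.
FONT_FILE = "NotoNastaliqUrdu-Regular.ttf"


def hb_view_command(font_file, text, size):
    """Build an hb-view argument list that renders `text` at `size`."""
    return [
        "hb-view",
        f"--font-file={font_file}",
        f"--font-size={size}",
        f"--text={text}",
        # Assumption: output file name chosen for illustration only.
        f"--output-file=hbview{size}.png",
    ]


# One command per font size tried in Emacs.
COMMANDS = [hb_view_command(FONT_FILE, TEXT, size) for size in (16, 24, 32)]

if __name__ == "__main__":
    # Print the commands; each list can also be passed to subprocess.run().
    for command in COMMANDS:
        print(shlex.join(command))
```

If the three hb-view images look alike apart from scale while the Emacs screenshots differ, that would support the observation that the rendering in Emacs depends on the font size.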
If I change the font using

(set-fontset-font t 'arabic (font-spec :family "Jameel Noori Nastaleeq" :size 32))

and move the cursor to the word "قسمت", which has 4 characters, the cursor encompasses all of them and "C-u C-x =" gives

----------------------------------------------------------------------
position: 157 of 283 (55%), column: 11
character: ق (displayed as ق) (codepoint 1602, #o3102, #x642)
charset: unicode (Unicode (ISO10646))
code point in charset: 0x0642
script: arabic
syntax: w which means: word
category: .:Base, R:Right-to-left (strong), b:Arabic
to input: type "C-x 8 RET 642" or "C-x 8 RET ARABIC LETTER QAF"
buffer code: #xD9 #x82
file code: #xD9 #x82 (encoded by coding system utf-8-unix)
display: composed to form "قسمت" (see below)

Composed with the following character(s) "سمت" using this font:
  ftcrhb:-pdms-Jameel Noori Nastaleeq-normal-normal-normal-*-32-*-*-*-*-0-iso10646-1
by these glyphs:
  [0 3 1578 11352 50 1 51 30 1 nil]
with these character(s):
  س (#x633) ARABIC LETTER SEEN
  م (#x645) ARABIC LETTER MEEM
  ت (#x62a) ARABIC LETTER TEH

Character code properties: customize what to show
  name: ARABIC LETTER QAF
  general-category: Lo (Letter, Other)
  decomposition: (1602) ('ق')

There are text properties here:
  fontified nil
----------------------------------------------------------------------

Changing to NotoNastaliqUrdu using

(set-fontset-font t 'arabic (font-spec :family "NotoNastaliqUrdu" :size 32))

the cursor moves through one character at a time, and moving the cursor to the beginning of the same word, "C-u C-x =" gives

----------------------------------------------------------------------
position: 157 of 282 (55%), column: 11
character: ق (displayed as ق) (codepoint 1602, #o3102, #x642)
charset: unicode (Unicode (ISO10646))
code point in charset: 0x0642
script: arabic
syntax: w which means: word
category: .:Base, R:Right-to-left (strong), b:Arabic
to input: type "C-x 8 RET 642" or "C-x 8 RET ARABIC LETTER QAF"
buffer code: #xD9 #x82
file code: #xD9 #x82 (encoded by coding system utf-8-unix)
display: composed to form "ق" (see below)

Composed using this font:
  ftcrhb:-GOOG-Noto Nastaliq Urdu-normal-normal-normal-*-32-*-*-*-*-0-iso10646-1
by these glyphs:
  [0 0 1602 16 0 -6 6 35 -26 [3 -16 0]]
  [0 0 1602 983 0 0 0 0 0 nil]
  [0 0 1602 284 8 -1 8 24 6 [0 -23 8]]

Character code properties: customize what to show
  name: ARABIC LETTER QAF
  general-category: Lo (Letter, Other)
  decomposition: (1602) ('ق')

There are text properties here:
  fontified t
----------------------------------------------------------------------

> (Are you sure that LibreOffice uses NotoNastaliqUrdu for the text you
> type there?  They could use a different font under the hood.)

LibreOffice uses something else by default; when I changed to NotoNastaliqUrdu, the appearance changed and matched what I get with hb-view.
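The glyph vectors in the two "C-u C-x =" outputs can be compared field by field. The sketch below assumes the layout given in the Emacs documentation of `composition-get-gstring`: each glyph is [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT], where ADJUSTMENT is nil or [X-OFF Y-OFF WADJUST]. The names `Glyph` and `parse_glyphs` are illustrative and not part of Emacs.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Glyph(NamedTuple):
    from_idx: int   # first character index covered by the glyph
    to_idx: int     # last character index covered by the glyph
    char: int       # character code stored in the glyph entry
    code: int       # glyph code in the font
    width: int
    lbearing: int
    rbearing: int
    ascent: int
    descent: int
    adjustment: Optional[Tuple[int, ...]]  # (x-off, y-off, wadjust) or None


# Nine integers followed by either nil or a three-integer vector.
GLYPH_RE = re.compile(
    r"\[((?:-?\d+\s+){9})(nil|\[\s*-?\d+\s+-?\d+\s+-?\d+\s*\])\s*\]"
)


def parse_glyphs(text: str) -> List[Glyph]:
    """Parse the glyph vectors printed under "by these glyphs:"."""
    glyphs = []
    for fields, adj in GLYPH_RE.findall(text):
        numbers = [int(n) for n in fields.split()]
        if adj == "nil":
            adjustment = None
        else:
            adjustment = tuple(int(n) for n in adj.strip("[]").split())
        glyphs.append(Glyph(*numbers, adjustment))
    return glyphs


# Glyph vectors copied from the two outputs in this message.
JAMEEL_GLYPHS = "[0 3 1578 11352 50 1 51 30 1 nil]"
NOTO_GLYPHS = (
    "[0 0 1602 16 0 -6 6 35 -26 [3 -16 0]] "
    "[0 0 1602 983 0 0 0 0 0 nil] "
    "[0 0 1602 284 8 -1 8 24 6 [0 -23 8]]"
)

if __name__ == "__main__":
    for name, data in (("Jameel", JAMEEL_GLYPHS), ("Noto", NOTO_GLYPHS)):
        for glyph in parse_glyphs(data):
            print(name, glyph.from_idx, glyph.to_idx, glyph.code,
                  glyph.width, glyph.adjustment)
```

Read this way, the Jameel Noori Nastaleeq output is a single glyph spanning character indices 0 to 3 with no adjustment, while the Noto Nastaliq Urdu output is three glyphs all mapped to index 0, two of which carry nonzero offsets.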

[Message part 2 (text/html, inline)]

[emacsq16.png (image/png, inline)]

[emacsq24.png (image/png, inline)]

[emacsq32.png (image/png, inline)]

This bug report was last modified 2 years and 289 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
