GNU bug report logs -
#50951
28.0.50; Urdu text is not displayed correctly
Reported by: Rah Guzar <aikrahguzar <at> gmail.com>
Date: Fri, 1 Oct 2021 20:19:01 UTC
Severity: normal
Tags: moreinfo
Found in version 28.0.50
Done: YAMAMOTO Mitsuharu <mituharu <at> math.s.chiba-u.ac.jp>
Bug is archived. No further changes may be made.
Message #26 received at 50951 <at> debbugs.gnu.org (full text, mbox):
On Sat, Oct 2, 2021 at 3:09 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> The way to investigate such problems is to see what does hb-view, a
> program that is part of the HarfBuzz installation, produce for the
> same text with the same font. If hb-view produces correct display,
> but Emacs doesn't, then the problem is indeed in Emacs; otherwise the
> problem is probably with the font, and in any case should be taken up
> with the HarfBuzz developers.
>
I tried hb-view with NotoNastaliqUrdu and the text:
خوبی اپنی قسمت کی
This is what I get
[image: urduhbtestnoto.png]
In Emacs, however, I discovered that how it is displayed depends a lot on the font
size.
For the same text at size 16, I get
[image: emacsq16.png]
At size 24 it looks almost correct
[image: emacsq24.png]
At size 32 it is really bad again
[image: emacsq32.png]
And the issue seems to be glyph placement rather than shaping.
NotoNastaliqUrdu seems to be the only font with this issue. I am not sure
if the problem is due to Nastaliq.
The other two Nastaliq fonts seem to handle joining characters through
composition. If I change the font using
(set-fontset-font t 'arabic (font-spec :family "Jameel Noori Nastaleeq" :size 32))
and move the cursor to the word "قسمت", which has 4 characters, the cursor
encompasses all of them and "C-u C-x ="
gives
-----------------------------------------------------------------------------------------------------------------------
position: 157 of 283 (55%), column: 11
character: ق (displayed as ق) (codepoint 1602, #o3102, #x642)
charset: unicode (Unicode (ISO10646))
code point in charset: 0x0642
script: arabic
syntax: w which means: word
category: .:Base, R:Right-to-left (strong), b:Arabic
to input: type "C-x 8 RET 642" or "C-x 8 RET ARABIC LETTER QAF"
buffer code: #xD9 #x82
file code: #xD9 #x82 (encoded by coding system utf-8-unix)
display: composed to form "قسمت" (see below)
Composed with the following character(s) "سمت" using this font:
ftcrhb:-pdms-Jameel Noori Nastaleeq-normal-normal-normal-*-32-*-*-*-*-0-iso10646-1
by these glyphs:
[0 3 1578 11352 50 1 51 30 1 nil]
with these character(s):
س (#x633) ARABIC LETTER SEEN
م (#x645) ARABIC LETTER MEEM
ت (#x62a) ARABIC LETTER TEH
Character code properties: customize what to show
name: ARABIC LETTER QAF
general-category: Lo (Letter, Other)
decomposition: (1602) ('ق')
There are text properties here:
fontified nil
-----------------------------------------------------------------------------------------------------------------------------
After changing to NotoNastaliqUrdu using
(set-fontset-font t 'arabic (font-spec :family "NotoNastaliqUrdu" :size 32))
the cursor moves through one character at a time. With the cursor at
the beginning of the same word,
"C-u C-x =" gives
-----------------------------------------------------------------------------------------------------------------------------------------
position: 157 of 282 (55%), column: 11
character: ق (displayed as ق) (codepoint 1602, #o3102, #x642)
charset: unicode (Unicode (ISO10646))
code point in charset: 0x0642
script: arabic
syntax: w which means: word
category: .:Base, R:Right-to-left (strong), b:Arabic
to input: type "C-x 8 RET 642" or "C-x 8 RET ARABIC LETTER QAF"
buffer code: #xD9 #x82
file code: #xD9 #x82 (encoded by coding system utf-8-unix)
display: composed to form "ق" (see below)
Composed using this font:
ftcrhb:-GOOG-Noto Nastaliq Urdu-normal-normal-normal-*-32-*-*-*-*-0-iso10646-1
by these glyphs:
[0 0 1602 16 0 -6 6 35 -26 [3 -16 0]]
[0 0 1602 983 0 0 0 0 0 nil]
[0 0 1602 284 8 -1 8 24 6 [0 -23 8]]
Character code properties: customize what to show
name: ARABIC LETTER QAF
general-category: Lo (Letter, Other)
decomposition: (1602) ('ق')
There are text properties here:
fontified t
-----------------------------------------------------------------------------------------------------------------------------------------
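For anyone comparing the two glyph listings above, here is a minimal sketch that names the fields of one glyph entry. It assumes the layout given in the docstring of `composition-get-gstring`, namely [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT], where ADJUSTMENT is either nil or [X-OFF Y-OFF WADJUST]. The helper name `my/describe-glyph` is hypothetical and not part of Emacs.

```elisp
;; Hypothetical helper: label the fields of one glyph entry as printed
;; by `describe-char'.  Assumed layout (from the docstring of
;; `composition-get-gstring'):
;;   [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT]
;; where ADJUSTMENT is nil or [X-OFF Y-OFF WADJUST].
(defun my/describe-glyph (glyph)
  "Return a plist naming selected fields of GLYPH, a 10-element vector."
  (let ((adj (aref glyph 9)))
    (list :char (aref glyph 2)           ; character code point
          :code (aref glyph 3)           ; glyph ID in the font
          :width (aref glyph 4)
          :x-off (and adj (aref adj 0))  ; nil when there is no adjustment
          :y-off (and adj (aref adj 1))
          :wadjust (and adj (aref adj 2)))))

;; The first NotoNastaliqUrdu entry from the listing above:
(my/describe-glyph [0 0 1602 16 0 -6 6 35 -26 [3 -16 0]])
;; The single Jameel Noori Nastaleeq entry:
(my/describe-glyph [0 3 1578 11352 50 1 51 30 1 nil])
```

Read this way, the NotoNastaliqUrdu entries carry non-nil adjustment vectors (for example X-OFF 3, Y-OFF -16), while the Jameel Noori Nastaleeq entry has none. That would fit the impression that the problem lies in how glyph offsets are applied rather than in shaping, but this is only a reading of the output, not a diagnosis.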
> (Are you sure that LibreOffice uses NotoNastaliqUrdu for the text you
> type there? They could use a different font under the hood.)
>
LibreOffice uses something else by default. When I changed to
NotoNastaliqUrdu, the appearance changed
and matched what I get with hb-view.
This bug report was last modified 2 years and 241 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.