GNU bug report logs - #33729
27.0.50; Partial glyphs not rendered for Gujarati with Harfbuzz enabled (renders fine using m17n)
Reported by: Kaushal Modi <kaushal.modi <at> gmail.com>
Date: Thu, 13 Dec 2018 20:22:02 UTC
Severity: normal
Found in version 27.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #62 received at 33729 <at> debbugs.gnu.org (full text, mbox):
Sounds good to me.
On Thu, Dec 20, 2018 at 1:58 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> Ping! Could someone on the Harfbuzz team please comment on the
> thoughts below? Khaled, Mohammad, Behdad?
>
> > Date: Mon, 17 Dec 2018 17:55:52 +0200
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Cc: dr.khaled.hosny <at> gmail.com, behdad <at> behdad.org, 33729 <at> debbugs.gnu.org,
> > far.nasiri.m <at> gmail.com, kaushal.modi <at> gmail.com
> >
> > > From: Glenn Morris <rgm <at> gnu.org>
> > > Cc: far.nasiri.m <at> gmail.com, dr.khaled.hosny <at> gmail.com,
> > > behdad <at> behdad.org, 33729 <at> debbugs.gnu.org, kaushal.modi <at> gmail.com
> > > Date: Sun, 16 Dec 2018 19:30:00 -0500
> > >
> > > > After some thinking, my conclusion is that we should import the
> > > > ISO 15924 database from https://unicode.org/iso15924/, use a script
> > > > similar to admin/unidata/blocks.awk to generate an alist from it that
> > > > maps Emacs script names to ISO 15924 tags, and then access that alist
> > > > from uni_script to get the correct script information to Harfbuzz.
> > > >
> > > > Patches implementing that are welcome.
> > >
> > > I live to write awk scripts. I'm not 100% sure what you want, but as a
> > > first example, the following takes
> > > http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt
> > > as input and outputs lines of the form "(gujr . gujarati)".
> > >
> > > The aliases are so that the RHS matches charscript.el.
> > >
> > > If this is not right, please clarify exactly what the inputs and output
> > > should be.
> >
> > Thanks.
> >
> > It turns out I didn't have this figured out completely, and your
> > proposal forced me to dig some more into the relevant parts of Unicode
> > and Emacs. I found a few additional issues and considerations; for at
> > least some of them I'd like to hear the opinions of the Harfbuzz
> > developers.
> >
> > Here are the issues:
> >
> > . Contrary to my original thoughts, I now tend to think that a
> > separate char-table, say char-iso15924-tag-table, that maps
> > character codepoints directly to the script tags, is a better
> > solution:
> > - it will allow a faster lookup, obviously
> > - the subdivision of characters into scripts, as shown in
> > Unicode's Scripts.txt, is slightly different from what
> > char-script-table does, so a simple mapping from Emacs scripts
> > to ISO 15924 script tag will not do. For example, many
> > characters Emacs puts into 'latin' or 'symbol' scripts are in
> > the Common script according to Scripts.txt, and similarly for
> > the Inherited script. I imagine this is important for
> > Harfbuzz.
> >
> > . Whether to produce the character-to-script-tag mapping using the
> > UCD files, such as Scripts.txt and PropertyValueAliases.txt, or the
> > canonical ISO 15924 tags from https://unicode.org/iso15924/,
> > depends on whether the slight differences mentioned in
> > https://www.unicode.org/reports/tr24/#Relation_To_ISO15924 matter
> > for Harfbuzz. For example, ISO 15924 has separate tags for the
> > Fraktur and Gaelic varieties of the Latin script: does this
> > distinction matter for Harfbuzz?
> >
> > . Does Harfbuzz handle the issues mentioned in
> > https://www.unicode.org/reports/tr24/#Script_Anomalies, and in
> > particular the use case of decomposed characters which yield a
> > different script than their precomposed variants? This use case is
> > quite common in handling of character compositions, so it's
> > important to understand its implications before we decide on the
> > implementation.
> >
> > To summarize, unless the Harfbuzz guys advise differently, I'd prefer
> > processing Scripts.txt and PropertyValueAliases.txt into a list
> > similar to the one we produce in charscript.el, then generate a
> > char-table from that list.
> >
> > Thanks again for working on this.
> >
> >
> >
> >
>
--
behdad
http://behdad.org/
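
The approach preferred in the quoted message (process Scripts.txt and PropertyValueAliases.txt into a list, then build a character-to-script-tag table from it) could be sketched roughly as below. This is only an illustration under stated assumptions, not the awk script or the Emacs code discussed in the thread: the function names are hypothetical, the sample lines are an abridged excerpt written in the format of the two UCD files, and the lowercased long name stands in for the Emacs script name, which in practice may need extra aliases to match charscript.el.

```python
# Abridged sample input in the format of PropertyValueAliases.txt (assumption:
# only the "sc" rows matter here).
ALIASES_SAMPLE = """\
sc ; Gujr                             ; Gujarati
sc ; Latn                             ; Latin
sc ; Zinh                             ; Inherited                        ; Qaai
sc ; Zyyy                             ; Common
"""

# Abridged sample input in the format of Scripts.txt.
SCRIPTS_SAMPLE = """\
0020          ; Common # Zs       SPACE
0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
0300..036F    ; Inherited # Mn [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X
0A81..0A82    ; Gujarati # Mn   [2] GUJARATI SIGN CANDRABINDU..GUJARATI SIGN ANUSVARA
0A83          ; Gujarati # Mc       GUJARATI SIGN VISARGA
"""


def parse_script_aliases(lines):
    """Map long script names (e.g. 'Gujarati') to lowercase four-letter tags."""
    aliases = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) >= 3 and fields[0] == "sc":
            aliases[fields[2]] = fields[1].lower()
    return aliases


def parse_script_ranges(lines, aliases):
    """Return sorted (first, last, tag) tuples from Scripts.txt-style lines."""
    ranges = []
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        cps, name = [f.strip() for f in line.split(";")]
        first, _, last = cps.partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo  # single code point: first == last
        ranges.append((lo, hi, aliases[name]))
    ranges.sort()
    return ranges


def script_tag(ranges, codepoint, default="zzzz"):
    """Look up the tag for a code point; unlisted code points are Unknown (Zzzz)."""
    for lo, hi, tag in ranges:
        if lo <= codepoint <= hi:
            return tag
    return default


aliases = parse_script_aliases(ALIASES_SAMPLE.splitlines())
ranges = parse_script_ranges(SCRIPTS_SAMPLE.splitlines(), aliases)

# One line in the "(gujr . gujarati)" form mentioned in the thread.
print("(%s . %s)" % (aliases["Gujarati"], "Gujarati".lower()))
```

A linear scan is used only to keep the sketch short; a real char-table would give the faster lookup the message asks for. Note that the combining marks in U+0300..U+036F come out as "zinh" (Inherited) rather than as the script of their base character, which is the kind of difference from char-script-table the message points out.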
This bug report was last modified 3 years and 22 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.