GNU bug report logs -
#19993
25.0.50; Unicode fonts defective on Windows
On Sat, Mar 07, 2015 at 10:14:16AM +0200, Eli Zaretskii wrote:
> > What can it mean that a font “supports a script”?
> >
> > Theoretically, it may mean that
> > • it “knows” all the characters in the script, and
> > • has enough extra infrastructure to shape these characters
> > into a correct glyphic representation.
> >
> > I can see that the second part could be described by one bit per
> > script. But what about the first one? The repertoire of a script
> > changes every year (sometimes several times per year). How can this
> > be encapsulated into a bit?
>
> All I know about this is what the MSDN documentation says:
>
> FONTSIGNATURE structure
>
> Contains information identifying the code pages and Unicode subranges
> for which a given font provides glyphs.
> [...]
> Members
>
> fsUsb
>
> A 128-bit Unicode subset bitfield (USB) identifying up to 126
> Unicode subranges. Each bit, except the two most significant bits,
> represents a single subrange. The most significant bit is always 1
> and identifies the bitfield as a font signature; the second most
> significant bit is reserved and must be 0. Unicode subranges are
> numbered in accordance with the ISO 10646 standard. For more
> information, see Unicode Subset Bitfields.
So these bits “identify” a subrange. Of course, nothing is said about
what this actually MEANS. So I did an experiment with Cour.ttf.
The following subrange is “identified”:
  Bit 9   0400 - 04FF   Cyrillic
          0500 - 052F   Cyrillic Supplement
          2DE0 - 2DFF   Cyrillic Extended-A
          A640 - A69F   Cyrillic Extended-B
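For reference, checking whether such a bit is set is simple. Here is a minimal sketch in Python, assuming fsUsb is available as four 32-bit integers with bits 0..31 in the first word (that is how the DWORD[4] array is laid out, as far as I know). The name usb_bit_set is made up for this illustration, and the table only restates the bit 9 entry above:

```python
# Ranges that bit 9 is documented to stand for (copied from the list above).
BIT9_RANGES = [
    (0x0400, 0x04FF),  # Cyrillic
    (0x0500, 0x052F),  # Cyrillic Supplement
    (0x2DE0, 0x2DFF),  # Cyrillic Extended-A
    (0xA640, 0xA69F),  # Cyrillic Extended-B
]


def usb_bit_set(fs_usb, bit):
    """Return True if subrange bit `bit` is set in fs_usb.

    fs_usb is assumed to be a sequence of four 32-bit integers,
    with bits 0..31 stored in fs_usb[0], bits 32..63 in fs_usb[1], etc.
    """
    word, offset = divmod(bit, 32)
    return bool((fs_usb[word] >> offset) & 1)


if __name__ == "__main__":
    # Hypothetical signature with only bit 9 set.
    sample = [1 << 9, 0, 0, 0]
    if usb_bit_set(sample, 9):
        for lo, hi in BIT9_RANGES:
            print(f"claims {lo:04X} - {hi:04X}")
```

Note that the test yields one yes/no answer for all four ranges together; it cannot say anything about individual characters.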
What is actually supported:
  0400 - 04FF   Everything but 04D8, 04D9 (Schwa, used in Cyrillic Azeri; contemporary Azeri is written in Latin, though)
  0500 - 052F   Only 0500 - 0513 and 051A - 051D supported
  2DE0 - 2DFF   None supported (Unicode 5.1)
  A640 - A69F   None supported (Unicode 5.1 and later)
Looking in DerivedAge.txt:
04D0..04EB ; 1.1 # [28] CYRILLIC CAPITAL LETTER A WITH BREVE..CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
0500..050F ; 3.2 # [16] CYRILLIC CAPITAL LETTER KOMI DE..CYRILLIC SMALL LETTER KOMI TJE
0510..0513 ; 5.0 # [4] CYRILLIC CAPITAL LETTER REVERSED ZE..CYRILLIC SMALL LETTER EL WITH HOOK
0514..0523 ; 5.1 # [16] CYRILLIC CAPITAL LETTER LHA..CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK
So two characters of 1.1 are not supported, all characters of 3.2 and 5.0 are
supported, and part of 5.1 is supported.
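To put numbers on the gap between “identified” and supported, here is a rough sketch in Python. The supported set is not read from the font; it is rebuilt by hand from the findings listed above, and the function names are invented for this illustration. Totals are the sizes of the ranges, not the number of assigned characters, so take the ratios as approximate:

```python
# Ranges claimed by the Cyrillic bit, as listed above.
CLAIMED = [
    (0x0400, 0x04FF),
    (0x0500, 0x052F),
    (0x2DE0, 0x2DFF),
    (0xA640, 0xA69F),
]


def reported_support():
    """Code points found to be supported, rebuilt from the findings above."""
    supported = set(range(0x0400, 0x0500)) - {0x04D8, 0x04D9}
    supported |= set(range(0x0500, 0x0514))  # 0500 - 0513
    supported |= set(range(0x051A, 0x051E))  # 051A - 051D
    return supported


def coverage(supported, ranges):
    """Return (lo, hi, supported_count, range_size) for each range."""
    result = []
    for lo, hi in ranges:
        have = sum(1 for cp in range(lo, hi + 1) if cp in supported)
        result.append((lo, hi, have, hi - lo + 1))
    return result


if __name__ == "__main__":
    for lo, hi, have, total in coverage(reported_support(), CLAIMED):
        print(f"{lo:04X} - {hi:04X}  {have}/{total}")
```

With a real font one would fill the supported set from the font's character map instead of typing it in; the counting stays the same.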
Does this look like a good indication of anything? I would say no… Do
you know of any other tool that looks at this bitfield when choosing
which font to pick for a particular character?
Ilya
This bug report was last modified 10 years and 155 days ago.