#19993 - 25.0.50; Unicode fonts defective on Windows

GNU bug report logs - #19993
25.0.50; Unicode fonts defective on Windows

Package: emacs;

Reported by: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>

Date: Tue, 3 Mar 2015 22:32:01 UTC

Severity: normal

Found in version 25.0.50


From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19993 <at> debbugs.gnu.org
Subject: bug#19993: 25.0.50; Unicode fonts defective on Windows
Date: Sat, 7 Mar 2015 23:41:58 -0800

On Sat, Mar 07, 2015 at 10:14:16AM +0200, Eli Zaretskii wrote:
> > What can it mean that a font “supports a script”?
> >
> > Theoretically, it may mean that
> >   • it “knows” all the characters in the script, and
> >   • has enough extra infrastructure to shape these characters
> >     into a correct glyphic representation.
> >
> > I may see that the second part may be described by one bit per
> > script.  But what about the first one?  A repertoire of a script
> > changes every year (sometimes several times per year).  How can this
> > be encapsulated into a bit?
>
> All I know about this is what the MSDN documentation says:
>
>   FONTSIGNATURE structure
>
>   Contains information identifying the code pages and Unicode subranges
>   for which a given font provides glyphs.
>   [...]
>   Members
>
>   fsUsb
>
>     A 128-bit Unicode subset bitfield (USB) identifying up to 126
>     Unicode subranges. Each bit, except the two most significant bits,
>     represents a single subrange. The most significant bit is always 1
>     and identifies the bitfield as a font signature; the second most
>     significant bit is reserved and must be 0. Unicode subranges are
>     numbered in accordance with the ISO 10646 standard. For more
>     information, see Unicode Subset Bitfields.

So these bits “identify” a subrange.  Of course, nothing is said about
what this actually MEANS.

So I did an experiment: Cour.ttf.  The following subrange is
“identified”:

  9   0400 - 04FF   Cyrillic
      0500 - 052F   Cyrillic Supplement
      2DE0 - 2DFF   Cyrillic Extended-A
      A640 - A69F   Cyrillic Extended-B

What is actually supported:

  0400 - 04FF   Everything but 04d8,04d9 (Schwa, used in Cyrillic
                Azeri, but contemporary Azeri is written in Latin)
  0500 - 052F   Only 0500 - 0513, 051a - 051d supported
  2DE0 - 2DFF   None supported (5.1)
  A640 - A69F   None supported (5.1 and later)

Looking in DerivedAge.txt:

  04D0..04EB ; 1.1 # [28] CYRILLIC CAPITAL LETTER A WITH BREVE..CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS
  0500..050F ; 3.2 # [16] CYRILLIC CAPITAL LETTER KOMI DE..CYRILLIC SMALL LETTER KOMI TJE
  0510..0513 ; 5.0 # [4] CYRILLIC CAPITAL LETTER REVERSED ZE..CYRILLIC SMALL LETTER EL WITH HOOK
  0514..0523 ; 5.1 # [16] CYRILLIC CAPITAL LETTER LHA..CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK

So two characters of 1.1 are not supported, all characters of 3.2 and
5.0 are supported, and part of 5.1 is supported.

Does it look like a good indication of anything?  I would say no…

Do you know any other tool looking at this bitmap for choosing which
font to pick up for a particular character?

Ilya
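
The comparison made in this message can be sketched in a few lines of Python. This is only an illustration, not code from Emacs or from the reporter: the names `usb_bit_is_set`, `reported_support` and `coverage` are hypothetical, the supported code points are transcribed from the message rather than read from Cour.ttf, and the sketch assumes that `fsUsb` is stored as four 32-bit words with bits 0-31 in the first word. It counts code point positions in each block, not assigned characters.

```python
# Illustration only: data transcribed from the message, not read from a font.

CYRILLIC_BIT = 9  # fsUsb bit that the message lists for the Cyrillic subranges

# Blocks that bit 9 "identifies", as listed in the message.
CYRILLIC_SUBRANGES = {
    "Cyrillic": (0x0400, 0x04FF),
    "Cyrillic Supplement": (0x0500, 0x052F),
    "Cyrillic Extended-A": (0x2DE0, 0x2DFF),
    "Cyrillic Extended-B": (0xA640, 0xA69F),
}


def usb_bit_is_set(fs_usb, bit):
    """Test one bit of a 128-bit subset bitfield.

    fs_usb is a sequence of four 32-bit integers; the assumption here is
    that word 0 holds bits 0-31, word 1 holds bits 32-63, and so on.
    """
    word, offset = divmod(bit, 32)
    return bool((fs_usb[word] >> offset) & 1)


def reported_support():
    """Code points the message reports as present in Cour.ttf."""
    supported = set(range(0x0400, 0x0500)) - {0x04D8, 0x04D9}
    supported |= set(range(0x0500, 0x0514))   # 0500 - 0513
    supported |= set(range(0x051A, 0x051E))   # 051a - 051d
    return supported


def coverage(supported, first, last):
    """Return (positions covered, positions in block) for first..last."""
    size = last - first + 1
    have = sum(1 for cp in range(first, last + 1) if cp in supported)
    return have, size


if __name__ == "__main__":
    # Hypothetical signature with only the Cyrillic bit set.
    fs_usb = [1 << CYRILLIC_BIT, 0, 0, 0]
    print("bit 9 set:", usb_bit_is_set(fs_usb, CYRILLIC_BIT))
    supported = reported_support()
    for name, (first, last) in CYRILLIC_SUBRANGES.items():
        have, size = coverage(supported, first, last)
        print(f"{name}: {have}/{size}")
```

With the numbers from the message, the single bit stands for four blocks whose coverage ranges from 254 of 256 positions (Cyrillic) down to none at all (both Extended blocks), which is the mismatch the message is pointing at.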

This bug report was last modified 10 years and 155 days ago.

