GNU bug report logs -
#20789
auto-generate more Unicode data from sources
Eli Zaretskii wrote:
>> I don't suppose that big list can be auto-generated from the inputs?
>
> It's not trivial. I describe below some of the issues, in the hope
> that Someone™ will volunteer:
Thanks. Script that processes Blocks.txt attached. Some questions:
1. In Blocks.txt:
FF00..FFEF; Halfwidth and Fullwidth Forms
In Emacs:
(#xFF00 #xFF5F cjk-misc)
(#xFF61 #xFF9F kana)
(#xFFE0 #xFFEF cjk-misc)
Is U+FF60 (FULLWIDTH RIGHT WHITE PARENTHESIS) intentionally omitted?
2. In Emacs, "olt-italic" looks like a typo for "old-italic". Can it be renamed?
3. In Blocks.txt, Anatolian Hieroglyphs ends at 1467F.
In Emacs, it ends at 1457F. Typo?
4. In Blocks.txt:
20000..2A6DF; CJK Unified Ideographs Extension B
2A700..2B73F; CJK Unified Ideographs Extension C
2B740..2B81F; CJK Unified Ideographs Extension D
2B820..2CEAF; CJK Unified Ideographs Extension E
2F800..2FA1F; CJK Compatibility Ideographs Supplement
In Emacs:
(#x20000 #x2CEAF han)
(#x2F800 #x2FFFF han)
Emacs adds the ranges 2A6E0..2A6FF and 2FA20..2FFFF, which Blocks.txt does
not cover. Is that intentional?
5. Newly added "sutton-sign-writing" - should be "sutton-signwriting"?
(The case-insensitive source says "Sutton SignWriting".)
[blocks.awk (application/octet-stream, attachment)]
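The range comparison behind question 4 can be sketched roughly as follows. This is not the attached blocks.awk, only a minimal Python illustration that assumes the "START..END; Name" line format quoted above; the helper names parse_blocks and uncovered are hypothetical.

```python
import re

# One Blocks.txt data line: "START..END; Block Name" (hex code points).
BLOCK_LINE = re.compile(r"^([0-9A-Fa-f]+)\.\.([0-9A-Fa-f]+);\s*(.+)$")


def parse_blocks(text):
    """Parse Blocks.txt-style lines into (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        m = BLOCK_LINE.match(line)
        if m:
            blocks.append((int(m.group(1), 16), int(m.group(2), 16),
                           m.group(3).strip()))
    return blocks


def uncovered(ranges, blocks):
    """Return the sub-ranges of `ranges` that no block covers."""
    gaps = []
    ordered = sorted(blocks)
    for start, end in ranges:
        pos = start  # first code point not yet known to be covered
        for b_start, b_end, _name in ordered:
            if b_end < pos or b_start > end:
                continue  # block lies entirely outside the remaining span
            if b_start > pos:
                gaps.append((pos, b_start - 1))
            pos = max(pos, b_end + 1)
            if pos > end:
                break
        if pos <= end:
            gaps.append((pos, end))
    return gaps


# Data copied from the message above.
BLOCKS_TXT = """\
20000..2A6DF; CJK Unified Ideographs Extension B
2A700..2B73F; CJK Unified Ideographs Extension C
2B740..2B81F; CJK Unified Ideographs Extension D
2B820..2CEAF; CJK Unified Ideographs Extension E
2F800..2FA1F; CJK Compatibility Ideographs Supplement
"""
EMACS_HAN = [(0x20000, 0x2CEAF), (0x2F800, 0x2FFFF)]

for lo, hi in uncovered(EMACS_HAN, parse_blocks(BLOCKS_TXT)):
    print(f"{lo:X}..{hi:X}")
```

With the five blocks and two Emacs ranges listed in question 4, this reports 2A6E0..2A6FF and 2FA20..2FFFF, the same two spans asked about above.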
This bug report was last modified 9 years and 355 days ago.