GNU bug report logs -
#20789
auto-generate more Unicode data from sources
> From: Glenn Morris <rgm <at> gnu.org>
> Cc: 20789 <at> debbugs.gnu.org
> Date: Wed, 17 Jun 2015 02:52:48 -0400
>
> Is there anything else in international/ that could benefit from being
> auto-generated?
Some. Things I've spotted:
. characters.el:
  . The modify-category-entry calls -- they basically can be derived
    from Blocks.txt
  . The modify-syntax-entry and set-case-syntax calls can be derived
    from the values of the 'general-category' property returned by
    'get-char-code-property', perhaps augmented by 'paired-bracket'
    and 'paired-type' properties
  . The set-case-syntax-pair calls (perhaps use the data in
    CaseFolding.txt, or even the case mapping information in
    UnicodeData.txt)
  . The setup of char-width-table -- I think the information is in
    EastAsianWidth.txt, with background information described in
    UAX#11 (http://www.unicode.org/reports/tr11/)
  . The setup of char-acronym-table: at least some of the data is in
    NameAliases.txt and NameList.txt
. fontset.el:
  . The setup of script-representative-chars
. mule-cmds.el:
  . The setting of locale-language-names -- the data is available in
    IANA's Language Subtag Registry
    (http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry)
    and in ISO 639-2 (http://www.loc.gov/standards/iso639-2/,
    http://www.loc.gov/standards/iso639-2/php/English_list.php)
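
To illustrate the kind of derivation meant for Blocks.txt and
EastAsianWidth.txt, here is a rough sketch in Python rather than
Emacs Lisp. Both files use the UCD layout "range ; value # comment",
so one reader might serve both. The names parse_ranges and
wide_ranges are hypothetical, the sample data is a hand-written
excerpt in that layout, and treating only the W and F classes as
double-width is an assumption, not what characters.el actually does.

```python
from typing import Iterator, List, Tuple


def parse_ranges(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, value) from UCD-style 'range ; value' lines."""
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 2:
            continue
        # A first field is either "XXXX" or "XXXX..YYYY" in hex.
        start, _, end = fields[0].partition("..")
        yield int(start, 16), int(end or start, 16), fields[1]


def wide_ranges(text: str) -> List[Tuple[int, int]]:
    """Return ranges whose East Asian Width is W or F (assumed width 2)."""
    return [(s, e) for s, e, v in parse_ranges(text) if v in ("W", "F")]


# Hand-written excerpts in the layout of the real files (assumption:
# the real files follow the same field order).
SAMPLE_WIDTH = """\
# EastAsianWidth excerpt
0020;Na # SPACE
1100..115F;W # HANGUL CHOSEONG
3000;F # IDEOGRAPHIC SPACE
"""

SAMPLE_BLOCKS = """\
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
"""

if __name__ == "__main__":
    for start, end in wide_ranges(SAMPLE_WIDTH):
        print(f"{start:04X}..{end:04X} width 2")
    for start, end, name in parse_ranges(SAMPLE_BLOCKS):
        print(f"{start:04X}..{end:04X} {name}")
```

A generator along these lines could emit the corresponding
char-width-table or modify-category-entry forms instead of printing;
how ambiguous-width (A) characters should be handled is left open
here.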
TIA
P.S. It would be good to add somewhere (admin/make-tarball.txt?) a
reminder to fetch all those reference files and regenerate the files
that depend on them, before we prepare a release.
This bug report was last modified 9 years and 355 days ago.