GNU bug report logs - #20789
auto-generate more Unicode data from sources

Package: emacs

Reported by: Glenn Morris <rgm <at> gnu.org>

Date: Thu, 11 Jun 2015 22:06:02 UTC

Severity: wishlist

Found in version 25.0.50


From: Eli Zaretskii <eliz <at> gnu.org>
To: Glenn Morris <rgm <at> gnu.org>
Cc: 20789 <at> debbugs.gnu.org
Subject: bug#20789: Invalid script or charset name: cuneiform-numbers-and-punctuation
Date: Wed, 17 Jun 2015 19:27:49 +0300

> From: Glenn Morris <rgm <at> gnu.org>
> Cc: 20789 <at> debbugs.gnu.org
> Date: Wed, 17 Jun 2015 02:52:48 -0400
> 
> Is there anything else in international/ that could benefit from being
> auto-generated?

Some.  Things I've spotted:

  . characters.el:

    . The modify-category-entry calls -- they can basically be derived
      from Blocks.txt

    . The modify-syntax-entry and set-case-syntax calls can be derived
      from the values of the 'general-category' property returned by
      'get-char-code-property', perhaps augmented by 'paired-bracket'
      and 'paired-type' properties

    . The set-case-syntax-pair calls (perhaps use the data in
      CaseFolding.txt, or even the case mapping information in
      UnicodeData.txt)

    . The setup of char-width-table -- I think the information is in
      EastAsianWidth.txt, with background information described in
      UAX#11 (http://www.unicode.org/reports/tr11/)

    . The setup of char-acronym-table: at least some of the data is in
      NameAliases.txt and NameList.txt

  . fontset.el:

    . The setup of script-representative-chars

  . mule-cmds.el:

    . The setting of locale-language-names -- the data is available in
      IANA's Language Subtag Registry
      (http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry)
      and in ISO 639-2 (http://www.loc.gov/standards/iso639-2/,
      http://www.loc.gov/standards/iso639-2/php/English_list.php)
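
To make the Blocks.txt and EastAsianWidth.txt items above more
concrete, here is a minimal sketch, in Python rather than Emacs Lisp,
of reading the "range; value" line format those two Unicode data files
share.  The sample lines are short excerpts in that format, and the
names parse_ucd_ranges and is_wide are hypothetical helpers invented
for this illustration.  An actual generator for Emacs would presumably
emit Lisp forms (for example the modify-category-entry calls) instead
of Python tuples.

```python
import re

# Matches "XXXX" or "XXXX..YYYY", then ";", then the value up to any "#" comment.
_LINE = re.compile(r"^\s*([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*([^#]*)")


def parse_ucd_ranges(text):
    """Return (first, last, value) tuples from UCD-style 'range; value' lines.

    Hypothetical helper for illustration; it skips blank and comment lines.
    """
    ranges = []
    for line in text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue  # blank line or "#" comment
        first = int(m.group(1), 16)
        # A single code point has no "..YYYY" part, so the range is one wide.
        last = int(m.group(2), 16) if m.group(2) else first
        ranges.append((first, last, m.group(3).strip()))
    return ranges


# Short excerpts in the format of the real files (not the full data).
BLOCKS_SAMPLE = """\
# Blocks.txt excerpt
0000..007F; Basic Latin
12400..1247F; Cuneiform Numbers and Punctuation
"""

WIDTH_SAMPLE = """\
# EastAsianWidth.txt excerpt
1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER
3000;F           # Zs         IDEOGRAPHIC SPACE
"""


def is_wide(codepoint, width_ranges):
    """True if codepoint lies in a range marked W (wide) or F (fullwidth)."""
    return any(
        lo <= codepoint <= hi and value in ("W", "F")
        for lo, hi, value in width_ranges
    )


blocks = parse_ucd_ranges(BLOCKS_SAMPLE)
widths = parse_ucd_ranges(WIDTH_SAMPLE)
```

With these samples, blocks holds one tuple per block range and
is_wide(0x3000, widths) is true, while an ASCII letter such as 0x41 is
not reported as wide because it appears in no W or F range.  Treating
W and F as double-width follows UAX#11; how Emacs would map the other
width classes is left open here.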
      
TIA

P.S. It would be good to add somewhere (admin/make-tarball.txt?) a
reminder to fetch all those reference files and regenerate the files
that depend on them, before we prepare a release.

