GNU bug report logs -
#20789
auto-generate more Unicode data from sources
Message #41 received at 20789 <at> debbugs.gnu.org (full text, mbox):
> From: Glenn Morris <rgm <at> gnu.org>
> Cc: Kenichi Handa <handa <at> gnu.org>, 20789 <at> debbugs.gnu.org
> Date: Fri, 26 Jun 2015 22:02:36 -0400
>
> Eli Zaretskii wrote:
>
> >> The width 2 characters look like they might be the "W" and "F" characters,
> >
> > Yes.
> >
> >> but just doing that gives a list that has many differences to the list
> >> Emacs uses.
>
> This is the list of "F" and "W" characters, compared to the 11 ranges that
> Emacs uses:

Looks good to me. Each of the 11 ranges we have now is either identical
to, or coarser than, the list derived from the UCD that you show.

> > I don't see any significant differences, except perhaps in unassigned
> > codepoints (see paragraph 6.1 of UAX#11 for the treatment of
> > unassigned CJK codepoints).
>
> I don't know if this means that the above needs modifying?

I was saying that we need to augment the list with the 5 ranges of
unassigned codepoints that belong to the CJK planes, as described in
that section of UAX#11. An unassigned codepoint has its
'general-category' property set to 'Cn', and the list of the 5 ranges
could be in some defconst, because it will probably never change.
Thanks.
This bug report was last modified 10 years and 86 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.