GNU bug report logs - #20789
auto-generate more Unicode data from sources
Message #41 received at 20789 <at> debbugs.gnu.org (full text, mbox):
> From: Glenn Morris <rgm <at> gnu.org>
> Cc: Kenichi Handa <handa <at> gnu.org>, 20789 <at> debbugs.gnu.org
> Date: Fri, 26 Jun 2015 22:02:36 -0400
>
> Eli Zaretskii wrote:
>
> >> The width 2 characters look like they might be the "W" and "F" characters,
> >
> > Yes.
> >
> >> but just doing that gives a list that has many differences to the list
> >> Emacs uses.
>
> This is the list of "F" and "W" characters, compared to the 11 ranges that
> Emacs uses:
Looks good to me. The 11 ranges we have now are either identical to or
coarser than the list derived from the UCD that you show.
> > I don't see any significant differences, except perhaps in unassigned
> > codepoints (see paragraph 6.1 of UAX#11 for the treatment of
> > unassigned CJK codepoints).
>
> I don't know if this means that the above needs modifying?
I was saying that we need to augment the list with the 5 ranges of
unassigned codepoints that belong to the CJK blocks and planes, as
described in that section of UAX#11. An unassigned codepoint has its
'general-category' property set to 'Cn', and the list of the 5 ranges
could be in some defconst, because it will probably never change.
Thanks.
This bug report was last modified 9 years and 356 days ago.