GNU bug report logs -
#39659
27.0.60; inappropriate han script definition in char-script-table
Reported by: ynyaaa <at> gmail.com
Date: Tue, 18 Feb 2020 13:52:01 UTC
Severity: normal
Found in version 27.0.60
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: ynyaaa <at> gmail.com
>> Date: Tue, 18 Feb 2020 22:50:57 +0900
>>
>> 'han' script is defined in char-script-table as:
>> 2E80-2FDF han
>> 3200-9FFF han
>> F900-FAFF han
>> FE30-FE4F han
>> 1F200-1F2FF han
>> 20000-2A6DF han
>> 2A700-2EBEF han
>> 2F800-2FA1F han
>>
>> It is better to set values as:
>> 3200-33FF cjk-misc
>> 4DC0-4DFF cjk-misc
>> FE30-FE4F cjk-misc
>> 1F200-1F2FF cjk-misc
>>
>> If enclosed CJK Ideographs should be 'han' script,
>> enclosed Hanguls should be 'hangul' script,
>> enclosed Katakana should be 'kana' script,
>> and enclosed Numbers should be 'symbol' script.
>
> Please provide some rationale for the differences, just saying
> "better" and "should" doesn't explain why you think the changes are
> for the good.
>
> CC'ing Handa-san, who I hope will have some comments on this.
>
> Thanks.
Because they are not han characters.
I think that combinatorial characters (enclosed, parenthesized or squared
forms) are symbols, not han characters.
Enclosed Latin letters, for example, are treated as 'symbol' script:
249C-24B5 PARENTHESIZED LATIN SMALL LETTER *
24B6-24CF CIRCLED LATIN CAPITAL LETTER *
24D0-24E9 CIRCLED LATIN SMALL LETTER *
1F110-1F129 PARENTHESIZED LATIN CAPITAL LETTER *
1F130-1F149 SQUARED LATIN CAPITAL LETTER *
1F150-1F169 NEGATIVE CIRCLED LATIN CAPITAL LETTER *
1F170-1F189 NEGATIVE SQUARED LATIN CAPITAL LETTER *
1F12A TORTOISE SHELL BRACKETED LATIN CAPITAL LETTER S
1F12B CIRCLED ITALIC LATIN CAPITAL LETTER C
1F12C CIRCLED ITALIC LATIN CAPITAL LETTER R
1F18A CROSSED NEGATIVE SQUARED LATIN CAPITAL LETTER P
1F1A5 SQUARED LATIN SMALL LETTER D
If the script is set to han, hangul or kana for combinatorial characters
that contain han, hangul or kana characters, the script values would be as follows:
CodePoint      Script  Comment
3200-321E      hangul  enclosed hangul
321F           -       unassigned
3220-3247      han     enclosed han
3248-324F      symbol  enclosed number
3250           symbol  combined latin
3251-325F      symbol  enclosed number
3260-327E      hangul  enclosed hangul
327F           symbol  symbol
3280-32B0      han     enclosed han
32B1-32BF      symbol  enclosed number
32C0-32CB      han     square character with han
32CC-32CF      symbol  square character with latin
32D0-32FE      kana    enclosed kana
32FF           han     square character with han
3300-3357      kana    square character with kana
3358-3370      han     square character with han
3371-337A      symbol  square character with latin
337B-337F      han     square character with han
3380-33DF      symbol  square character with latin
33E0-33FE      han     square character with han
33FF           symbol  square character with latin
4DC0-4DFF      symbol  symbol
FE30-FE44      symbol  symbol for vertical
FE45-FE46      symbol  symbol
FE47-FE48      symbol  symbol for vertical
FE49-FE4F      symbol  symbol
1F200-1F202    kana    enclosed/square character with kana
...            -       unassigned
1F210-1F212    han     enclosed han
1F213          kana    enclosed kana
1F214-1F248    han     enclosed han
...            -       unassigned
1F250-1F251    han     enclosed han
...            -       unassigned
1F260-1F265    symbol  symbol
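
To make the per-range assignment above easier to check, here is a minimal sketch in Python that looks up the script for a code point. It is only an illustration: the names PROPOSED and proposed_script are hypothetical, the rows are copied from the table above, and this is not how Emacs itself stores or queries char-script-table.

```python
from bisect import bisect_right

# (first, last, script) rows copied from the table above.
# Unassigned gaps are omitted; the four FE30-FE4F rows are all 'symbol'
# and are merged into one range here.
PROPOSED = [
    (0x3200, 0x321E, "hangul"),
    (0x3220, 0x3247, "han"),
    (0x3248, 0x324F, "symbol"),
    (0x3250, 0x3250, "symbol"),
    (0x3251, 0x325F, "symbol"),
    (0x3260, 0x327E, "hangul"),
    (0x327F, 0x327F, "symbol"),
    (0x3280, 0x32B0, "han"),
    (0x32B1, 0x32BF, "symbol"),
    (0x32C0, 0x32CB, "han"),
    (0x32CC, 0x32CF, "symbol"),
    (0x32D0, 0x32FE, "kana"),
    (0x32FF, 0x32FF, "han"),
    (0x3300, 0x3357, "kana"),
    (0x3358, 0x3370, "han"),
    (0x3371, 0x337A, "symbol"),
    (0x337B, 0x337F, "han"),
    (0x3380, 0x33DF, "symbol"),
    (0x33E0, 0x33FE, "han"),
    (0x33FF, 0x33FF, "symbol"),
    (0x4DC0, 0x4DFF, "symbol"),
    (0xFE30, 0xFE4F, "symbol"),
    (0x1F200, 0x1F202, "kana"),
    (0x1F210, 0x1F212, "han"),
    (0x1F213, 0x1F213, "kana"),
    (0x1F214, 0x1F248, "han"),
    (0x1F250, 0x1F251, "han"),
    (0x1F260, 0x1F265, "symbol"),
]

# Range starts, sorted ascending, used for binary search.
STARTS = [first for first, _, _ in PROPOSED]


def proposed_script(cp):
    """Return the script listed above for code point cp, or None if not listed."""
    i = bisect_right(STARTS, cp) - 1  # last range starting at or before cp
    if i >= 0:
        first, last, script = PROPOSED[i]
        if cp <= last:
            return script
    return None
```

For example, proposed_script(0x3220) gives "han", while proposed_script(0x321F) gives None because that code point is unassigned.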
This bug report was last modified 5 years and 162 days ago.