GNU bug report logs - #36070
27; feature request '(Describe Char Unidata List) to include 'kDefinition' value
Message #11 received at 36070 <at> debbugs.gnu.org (full text, mbox):
> On 4 Jun 2019, at 01:06, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> --8<---------------cut here---------------start------------->8---
>> Character code properties: customize what to show
>> name: CJK IDEOGRAPH-5165
>> general-category: Lo (Letter, Other)
>> decomposition: (20837) ('入')
>> --8<---------------cut here---------------end--------------->8---
>
> This comes from UnicodeData.txt, our source for the Unicode properties
> of all the characters. We parse it into uni-*.el files as part of the
> build.
>
>> The Readings table, in particular, is nice to have for the 'kDefinition'.
>>
>> --8<---------------cut here---------------start------------->8---
>> | Data type   | Value                    |
>> |-------------+--------------------------|
>> | kDefinition | enter, come in(to), join |
>> --8<---------------cut here---------------end--------------->8---
>
> This comes from Unihan_Reading.txt, a different file that is part of
> the Unihan database.
>
> We don't currently have a property where to put this value, so we need
> first to extend the properties. And then we will need to parse the
> above file and populate the property. Patches welcome. Bonus points
> for reviewing other properties of the Unihan DB and adding whatever is
> useful. See UAX#38 (http://www.unicode.org/reports/tr38/), for the
> description of the properties.
Thanks for pointing this out. I definitely want to know more about the Unihan DB and extend the handling of this information.
-- Van
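
The parsing step described in the quoted message could be sketched roughly as follows. This is only an illustration in Python, not how Emacs builds its uni-*.el files. It assumes the Unihan data files use tab-separated lines of the form "U+XXXX<TAB>field<TAB>value" with "#" comment lines, as described in UAX#38. The function name parse_unihan_field and the sample lines are hypothetical, written by hand for this example rather than copied from the database. As far as I know, the readings file ships as Unihan_Readings.txt in current Unihan releases.

```python
from typing import Dict, Iterable


def parse_unihan_field(lines: Iterable[str], field: str = "kDefinition") -> Dict[int, str]:
    """Collect one Unihan field into a {code point: value} table.

    Hypothetical helper: assumes tab-separated "U+XXXX<TAB>field<TAB>value"
    lines and "#" comment lines.
    """
    table: Dict[int, str] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Split into at most three parts so tabs inside the value survive.
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue
        codepoint, name, value = parts
        if name != field or not codepoint.startswith("U+"):
            continue
        table[int(codepoint[2:], 16)] = value
    return table


# Hand-written sample in the assumed format (not read from the real file).
SAMPLE = [
    "# Unihan-style sample data, for illustration only\n",
    "U+5165\tkDefinition\tenter, come in(to), join\n",
    "U+5165\tkMandarin\trù\n",
]

definitions = parse_unihan_field(SAMPLE)
print(definitions.get(ord("入")))
```

A table like this would still have to be turned into a new character property before describe-char could display it, which is the "extend the properties" step mentioned above.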
This bug report was last modified 6 years and 12 days ago.