GNU bug report logs - #34862
27.0.50; Trying to update pinyin.map

Package: emacs

Reported by: Eric Abrahamsen <eric <at> ericabrahamsen.net>

Date: Thu, 14 Mar 2019 21:52:01 UTC

Severity: wishlist

Found in version 27.0.50

Message #11 received at 34862 <at> debbugs.gnu.org (full text, mbox):

From: Eric Abrahamsen <eric <at> ericabrahamsen.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 34862 <at> debbugs.gnu.org
Subject: Re: bug#34862: 27.0.50; Trying to update pinyin.map
Date: Thu, 14 Mar 2019 22:58:14 -0700

On 03/15/19 07:03 AM, Eli Zaretskii wrote:
>> From: Eric Abrahamsen <eric <at> ericabrahamsen.net>
>> Date: Thu, 14 Mar 2019 14:49:51 -0700
>> 
>> 
>> As discussed in bug#34215, I'm trying to update the
>> romanization-to-Chinese-character mapping in the
>> file ./leim/MISC-DIC/pinyin.map to use the more complete mapping
>> provided by the Google pinyin input method, licensed under Apache 2.0.
>> This expands the number of characters recognized by Emacs from around
>> 7,000 to around 17,000. (And increases the size of the mapping file from
>> 18K to 53K.)
>> 
>> I'm running into encoding problems when adding the new characters --
>> Emacs says some of the characters can't be written using the existing
>> coding system. The original file has an encoding cookie reading coding:
>> cn-gb-2312, and describing the coding system gives me:
>> 
>> chinese-iso-8bit-dos (alias: cn-gb-2312-dos euc-china-dos euc-cn-dos
>>   cn-gb-dos gb2312-dos)
>> 
>> The characters *can* be encoded using gb18030, and of course utf8. The
>> Wikipedia page for gb18030 describes gb2312 as "legacy"[1], and says
>> gb18030 is a superset of gb2312.
>> 
>> Is there any reason not to go straight to utf8 for this file? If that's
>> not okay, would gb18030 be acceptable?
>
> I'm not sure I understand: the encoding of which file would you like
> to change?  Could you please clarify?

Sorry, I'm trying to add more characters to ./leim/MISC-DIC/pinyin.map,
which is encoded as chinese-iso-8bit-dos, and that encoding can't
represent the new characters. That's the file whose encoding I'd like
to change.
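
To illustrate the kind of failure involved, here is a minimal sketch in
Python (not Emacs Lisp, and not how Emacs itself performs the check). It
assumes Python's "gb2312" codec is a fair stand-in for
chinese-iso-8bit; the helper name `unencodable` and the sample string
are made up for the example, and U+3400 is used only as one character
known to be outside GB2312.

```python
def unencodable(text, codec):
    """Return the characters in `text` that `codec` cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Hypothetical sample: two common characters plus U+3400
# (CJK Extension A), which GB2312 does not cover.
sample = "\u4e2d\u6587\u3400"

for codec in ("gb2312", "gb18030", "utf-8"):
    missing = unencodable(sample, codec)
    print(codec, [f"U+{ord(ch):04X}" for ch in missing])
```

Running the same helper over the characters of a candidate mapping
would show which entries force a move away from the legacy encoding.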

Thanks,
Eric

This bug report was last modified 175 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
