From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 05:44:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154856781529741 (code B ref -1); Sun, 27 Jan 2019 05:44:01 +0000 Received: (at submit) by debbugs.gnu.org; 27 Jan 2019 05:43:35 +0000 Received: from localhost ([127.0.0.1]:46925 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gndEE-0007jc-Vj for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:35 -0500 Received: from eggs.gnu.org ([209.51.188.92]:38011) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gndED-0007jO-OM for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:34 -0500 Received: from lists.gnu.org ([209.51.188.17]:53030) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gndE8-0005na-CK for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:28 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48156) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gndE7-0003TG-Ax for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:43:28 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_50,RCVD_IN_DNSWL_MED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gnd5n-00078y-J9 for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:34:52 -0500 Received: from mail.ericabrahamsen.net ([50.56.99.223]:41691) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) 
(envelope-from ) id 1gnd5n-00070g-AR for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:34:51 -0500 Received: from localhost (71-212-20-199.tukw.qwest.net [71.212.20.199]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) (Authenticated sender: eric@ericabrahamsen.net) by mail.ericabrahamsen.net (Postfix) with ESMTPSA id 704B13FB4D for ; Sun, 27 Jan 2019 05:34:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mail.ericabrahamsen.net; s=mail; t=1548567281; bh=hATpaYe8b5gfblQkICWNiyyFL7jF0VR8OY9FHuU4TfA=; h=From:To:Subject:Date:From; b=HAsjdzu51V8s45m5iapxQWt0meCNtuM9SX2wk8evWls+5UPAr6Sum5WyG+EFz7zNd H2Y4xEw7JClZuK7lsV7RXQClSMLVoxewhx2oNYnsbxfuOpf5JfnyzxLuxso2RNDTqC v8Jtc0218f81O5ylWyUrlRS+IuwXnXEnyp7PshZo= From: Eric Abrahamsen Date: Sat, 26 Jan 2019 21:34:39 -0800 Message-ID: <87imyafyts.fsf@ericabrahamsen.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 50.56.99.223 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.9 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.1 (/) --=-=-= Content-Type: text/plain This bug report is apropos to this[1] emacs.devel thread. The basic idea is that in the Emacs sources there's a file containing a mapping between pinyin -- the most common Chinese romanization system -- and Chinese characters themselves. The mapping lives in leim/MISC-DIC/pinyin.map, and is converted into a quail input method by the `py-converter' function in titdic-cnv.el, which is part of the "make" process. 
I want this mapping to be available to elisp code in general, because it's useful for all kinds of other language utilities (searching Chinese characters using ascii letters, etc.).

pinyin.map is a plain text file, each line consisting of a romanized syllable, a TAB, and a string of the possible corresponding Chinese characters. `titdic-convert' parses this and feeds it to `quail-define-rules'.

My first thought was to add an intermediate step, where `titdic-convert' first composes an alist, then feeds that alist to `quail-define-rules', which would also allow us access to the alist. The more I looked at it, the more hacky and awkward that approach seemed, and it's not like it would save any memory: you still end up with the data both in a quail method and in a separate alist.

So this proposed patch simply parses the same file in the same way, but in a different location. I've put it in china-util.el, but chinese.el would also be a reasonable spot. Both those files are concerned with encoding, but at least "china-util" gives the impression that it could be a grab-bag.

I'm not sure this use of `source-directory' is particularly robust, but I don't know how else to handle it.

Hope this will be considered!

Eric

[1]: https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00306.html

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=0001-New-constant-chinese-pinyin-character-map.patch

>From f63b918057f7eaf6f8eebb28071ac17dd5ab3ff1 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen
Date: Sat, 26 Jan 2019 20:11:23 -0800
Subject: [PATCH] New constant chinese-pinyin-character-map

* lisp/language/china-util.el (chinese-pinyin-character-map): Constant holding an alist built from the pinyin-to-character mapping provided in the file pinyin.map.
--- lisp/language/china-util.el | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/lisp/language/china-util.el b/lisp/language/china-util.el index 70710bac18..cdbd8e322f 100644 --- a/lisp/language/china-util.el +++ b/lisp/language/china-util.el @@ -30,7 +30,7 @@ ;;; Code: -;; Hz/ZW/EUC-TW encoding stuff +;; Hz/ZW/EUC-TW encoding stuff, also a pinyin-to-character mapping. ;; HZ is an encoding method for Chinese character set GB2312 used ;; widely in Internet. It is very similar to 7-bit environment of @@ -202,6 +202,30 @@ pre-write-encode-hz (let (last-coding-system-used) (encode-hz-region 1 (point-max))) nil)) + +;;; Elisp-accessible version of the pinyin-to-character mapping +;;; provided in leim/MISC-DIC/pinyin.map, which is otherwise only +;;; exposed to the quail input method. + +(eval-and-compile + (defconst chinese-pinyin-character-map + (let ((py-file (expand-file-name + "leim/MISC-DIC/pinyin.map" + source-directory)) + alst) + (with-temp-buffer + (insert-file-contents py-file) + (re-search-forward "^[^%]" (point-max) t) + (beginning-of-line) + (while (re-search-forward "^\\([[:ascii:]]+\\)\t\\(\\cc+\\)$" + (point-max) t) + (push (cons (match-string-no-properties 1) + (match-string-no-properties 2)) +alst)) + (nreverse alst))) + "An alist mapping pinyin syllables to Chinese characters. 
+Produced from data in pinyin.map.")) + ;; (provide 'china-util) -- 2.20.1 --=-=-=-- From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 15:43:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154860373430764 (code B ref 34215); Sun, 27 Jan 2019 15:43:01 +0000 Received: (at 34215) by debbugs.gnu.org; 27 Jan 2019 15:42:14 +0000 Received: from localhost ([127.0.0.1]:47732 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnmZa-000808-Fw for submit@debbugs.gnu.org; Sun, 27 Jan 2019 10:42:14 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48553) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnmZX-0007zs-Vu for 34215@debbugs.gnu.org; Sun, 27 Jan 2019 10:42:12 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:35319) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gnmZS-0004pl-K3; Sun, 27 Jan 2019 10:42:06 -0500 Received: from [176.228.60.248] (port=3578 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gnmZS-0004Hf-80; Sun, 27 Jan 2019 10:42:06 -0500 Date: Sun, 27 Jan 2019 17:41:50 +0200 Message-Id: <83r2cy3y69.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87imyafyts.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Sat, 26 Jan 2019 21:34:39 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Eric Abrahamsen > Date: Sat, 26 Jan 2019 21:34:39 -0800 > > So this proposed patch simply parses the same file in the same way, but > in a different location. I've put it in china-util.el, but chinese.el > would also be a reasonable spot. Both those files are concerned with > encoding, but at least "china-util" gives the impression that it could > be a grab-bag. How much does this add to Emacs memory footprint when loaded? Since this will be required only rarely, I doubt that it would be a good idea to force every user of Chinese language to pay the price, if it is significant. It would be better to have this as a separate file with autoloaded variable or function, IMO. > I'm not sure this use of `source-directory' is particularly robust, but > I don't know how else to handle it. source-directory might not exist in a given installation. Maybe we should have the data copied into that separate file I mentioned above. Thanks. 
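For reference, the parsing step at the heart of the patch can be sketched on its own, without touching `source-directory'. This is only an illustration, assuming the pinyin.map layout described in the report (syllable, TAB, characters); the function name `my-pinyin-parse-string' is hypothetical and does not appear in the patch, and the regexp is simplified to lowercase ASCII syllables.

```elisp
;;; -*- lexical-binding: t; -*-
;; Hypothetical helper, not part of the proposed patch.  It parses text
;; in the pinyin.map layout (syllable, TAB, characters) into an alist,
;; taking the text as a string so no file lookup is involved.
(defun my-pinyin-parse-string (text)
  "Return an alist of (SYLLABLE . CHARACTERS) parsed from TEXT.
Lines that do not start with a lowercase syllable followed by a
TAB, such as header or comment lines, are skipped."
  (let (alist)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Each data line: one or more a-z letters, a TAB, then the
      ;; string of candidate characters up to the end of the line.
      (while (re-search-forward "^\\([a-z]+\\)\t\\(.+\\)$" nil t)
        (push (cons (match-string-no-properties 1)
                    (match-string-no-properties 2))
              alist)))
    ;; Entries were pushed in reverse; restore file order.
    (nreverse alist)))
```

Because the helper takes a string, the same code could be fed from a file read at build time, which sidesteps the question of whether `source-directory' exists in an installed Emacs.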
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 18:04:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154861218719932 (code B ref -1); Sun, 27 Jan 2019 18:04:01 +0000 Received: (at submit) by debbugs.gnu.org; 27 Jan 2019 18:03:07 +0000 Received: from localhost ([127.0.0.1]:47909 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnolu-0005BP-Fe for submit@debbugs.gnu.org; Sun, 27 Jan 2019 13:03:06 -0500 Received: from eggs.gnu.org ([209.51.188.92]:45040) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnols-0005At-QU for submit@debbugs.gnu.org; Sun, 27 Jan 2019 13:03:05 -0500 Received: from lists.gnu.org ([209.51.188.17]:56388) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gnoln-0004b8-FB for submit@debbugs.gnu.org; Sun, 27 Jan 2019 13:02:59 -0500 Received: from eggs.gnu.org ([209.51.188.92]:55241) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gnolm-0000cJ-D5 for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 13:02:59 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_50,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gnoll-0004aV-Go for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 13:02:58 -0500 Received: from [195.159.176.226] (port=44178 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gnolk-0004Zd-QU for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 13:02:57 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gnolh-000zM4-OT for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 19:02:53 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Sun, 27 Jan 2019 10:02:48 -0800 Message-ID: <87a7jmf06v.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:BWypqh86Vg3myYTb4Tn0j6VOFRU= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eli Zaretskii writes: >> From: Eric Abrahamsen >> Date: Sat, 26 Jan 2019 21:34:39 -0800 >> >> So this proposed patch simply parses the same file in the same way, but >> in a different location. I've put it in china-util.el, but chinese.el >> would also be a reasonable spot. Both those files are concerned with >> encoding, but at least "china-util" gives the impression that it could >> be a grab-bag. > > How much does this add to Emacs memory footprint when loaded? I actually don't know how to measure the memory taken up by the contents of a variable, but I imagine it's fairly significant. Or maybe I could do a "before and after" measurement of all of Emacs. > Since this will be required only rarely, I doubt that it would be a > good idea to force every user of Chinese language to pay the price, if > it is significant. 
It would be better to have this as a separate file > with autoloaded variable or function, IMO. That sounds fine to me. I agree the data shouldn't be loaded unless it's been explicitly requested. >> I'm not sure this use of `source-directory' is particularly robust, but >> I don't know how else to handle it. > > source-directory might not exist in a given installation. > > Maybe we should have the data copied into that separate file I > mentioned above. I can imagine a few ways of doing that: 1. Just manually copy the data into a new file and add it to the repo (pinyin.map hasn't been updated in years). 2. Do the copy at build time. I'm not quite sure where that function would live, or how it would get called. 3. Use an `eval-and-compile' form as in the patch I provided. Is working back from `load-file-name' more reliable than using `source-directory'? Autoloading a variable seems to copy the value of the variable into the loaddefs file, so there's no point to that. I figure we can just ask people who want this value to require the library. Thanks, Eric PS: pinyin.map is ancient and is missing a lot of good correspondences. Google's pinyin input method uses a much larger map, licensed with Apache v2.0. This[1] seems to indicate that Apache 2.0 is okay for Gnu projects, maybe we could consider switching to that map? 
Footnotes: [1] https://www.gnu.org/licenses/license-list.en.html#apache2 From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 18:15:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154861289220946 (code B ref 34215); Sun, 27 Jan 2019 18:15:01 +0000 Received: (at 34215) by debbugs.gnu.org; 27 Jan 2019 18:14:52 +0000 Received: from localhost ([127.0.0.1]:47918 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnoxI-0005Rm-3M for submit@debbugs.gnu.org; Sun, 27 Jan 2019 13:14:52 -0500 Received: from eggs.gnu.org ([209.51.188.92]:46643) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnoxF-0005RW-LP for 34215@debbugs.gnu.org; Sun, 27 Jan 2019 13:14:50 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:37307) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gnox6-00005E-Ra; Sun, 27 Jan 2019 13:14:43 -0500 Received: from [176.228.60.248] (port=1232 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gnox5-0006BF-Ol; Sun, 27 Jan 2019 13:14:40 -0500 Date: Sun, 27 Jan 2019 20:14:23 +0200 Message-Id: <83ef8y3r40.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87a7jmf06v.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Sun, 27 Jan 2019 10:02:48 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Eric Abrahamsen > Date: Sun, 27 Jan 2019 10:02:48 -0800 > > >> I'm not sure this use of `source-directory' is particularly robust, but > >> I don't know how else to handle it. > > > > source-directory might not exist in a given installation. > > > > Maybe we should have the data copied into that separate file I > > mentioned above. > > I can imagine a few ways of doing that: > > 1. Just manually copy the data into a new file and add it to the repo > (pinyin.map hasn't been updated in years). > 2. Do the copy at build time. I'm not quite sure where that function > would live, or how it would get called. > 3. Use an `eval-and-compile' form as in the patch I provided. Is working > back from `load-file-name' more reliable than using > `source-directory'? 2 is what I had in mind. I don't think it matters where the code lives, it's small enough to not matter. It would be called like the various *-convert functions we invoke at build time to build the dictionaries needed for CJK input methods, see the files in the leim/ directory. > Autoloading a variable seems to copy the value of the variable into the > loaddefs file, so there's no point to that. I figure we can just ask > people who want this value to require the library. Right. > PS: pinyin.map is ancient and is missing a lot of good correspondences. > Google's pinyin input method uses a much larger map, licensed with > Apache v2.0. This[1] seems to indicate that Apache 2.0 is okay for Gnu > projects, maybe we could consider switching to that map? Maybe. Unfortunately, I don't know enough about these input methods to tell whether replacing the file is a good idea. I wonder who can we ask about this. 
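To make the "just require the library" usage concrete, here is a rough sketch of how calling code might query such an alist once it is loaded. Everything named `my-pinyin-...' is hypothetical, and the sample data is a made-up stand-in for illustration rather than an excerpt from pinyin.map; the entry shape follows the (SYLLABLE . CHARACTERS) conses built by the first patch.

```elisp
;;; -*- lexical-binding: t; -*-
;; Illustrative only: a tiny stand-in for the proposed alist.  The
;; real value would be provided by the generated library.
(defvar my-pinyin-sample-map
  '(("a" . "阿啊")
    ("e" . "阿俄"))
  "Sample alist of (SYLLABLE . CHARACTERS) pairs, for illustration.")

(defun my-pinyin-characters (syllable map)
  "Return the string of characters listed under SYLLABLE in MAP, or nil."
  (cdr (assoc syllable map)))

(defun my-pinyin-syllables-for (char map)
  "Return the syllables in MAP whose character string contains CHAR."
  (let (found)
    (dolist (entry map (nreverse found))
      ;; `regexp-quote' keeps the character literal in the match.
      (when (string-match-p (regexp-quote (string char)) (cdr entry))
        (push (car entry) found)))))
```

The reverse lookup is the piece that a "search Chinese text with ascii letters" utility would build on; scanning the whole alist for each character is fine for a sketch, though a real consumer would probably build a hash table once after loading.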
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 19:19:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154861672927507 (code B ref -1); Sun, 27 Jan 2019 19:19:02 +0000 Received: (at submit) by debbugs.gnu.org; 27 Jan 2019 19:18:49 +0000 Received: from localhost ([127.0.0.1]:47971 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnpxB-00079b-Ge for submit@debbugs.gnu.org; Sun, 27 Jan 2019 14:18:49 -0500 Received: from eggs.gnu.org ([209.51.188.92]:59189) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnpx9-00079M-Qn for submit@debbugs.gnu.org; Sun, 27 Jan 2019 14:18:48 -0500 Received: from lists.gnu.org ([209.51.188.17]:48762) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gnpx4-0007BR-4r for submit@debbugs.gnu.org; Sun, 27 Jan 2019 14:18:42 -0500 Received: from eggs.gnu.org ([209.51.188.92]:41154) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gnpx3-0006iM-5r for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 14:18:42 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_50,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gnpx2-0007Aa-77 for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 14:18:41 -0500 Received: from [195.159.176.226] (port=36534 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gnpx1-0007A8-SP for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 14:18:40 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gnpwz-0009UD-Ca for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 20:18:37 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Sun, 27 Jan 2019 11:18:29 -0800 Message-ID: <875zu9gb96.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:ezC8LoBRQBg/hBv1wYCrhGjq0pY= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eli Zaretskii writes: >> From: Eric Abrahamsen >> Date: Sun, 27 Jan 2019 10:02:48 -0800 >> >> >> I'm not sure this use of `source-directory' is particularly robust, but >> >> I don't know how else to handle it. >> > >> > source-directory might not exist in a given installation. >> > >> > Maybe we should have the data copied into that separate file I >> > mentioned above. >> >> I can imagine a few ways of doing that: >> >> 1. Just manually copy the data into a new file and add it to the repo >> (pinyin.map hasn't been updated in years). >> 2. Do the copy at build time. I'm not quite sure where that function >> would live, or how it would get called. >> 3. Use an `eval-and-compile' form as in the patch I provided. 
Is working >> back from `load-file-name' more reliable than using >> `source-directory'? > > 2 is what I had in mind. I don't think it matters where the code > lives, it's small enough to not matter. It would be called like the > various *-convert functions we invoke at build time to build the > dictionaries needed for CJK input methods, see the files in the leim/ > directory. Okay, I'll put that together and add it to one of the Makefiles. I suppose it could go in leim/Makefile.in, though it technically isn't part of leim, and I was expecting the resulting file to go to lisp/language/. But it would be convenient to put the generation function in titdic-cnv.el. >> Autoloading a variable seems to copy the value of the variable into the >> loaddefs file, so there's no point to that. I figure we can just ask >> people who want this value to require the library. > > Right. > >> PS: pinyin.map is ancient and is missing a lot of good correspondences. >> Google's pinyin input method uses a much larger map, licensed with >> Apache v2.0. This[1] seems to indicate that Apache 2.0 is okay for Gnu >> projects, maybe we could consider switching to that map? > > Maybe. Unfortunately, I don't know enough about these input methods > to tell whether replacing the file is a good idea. I wonder who can > we ask about this. It's more or less a drop-in replacement -- the format of the data would be the same, only a bit more of it. I'm not sure who is "in charge" of these files, though. 
Eric From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 27 Jan 2019 19:49:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154861852530406 (code B ref 34215); Sun, 27 Jan 2019 19:49:02 +0000 Received: (at 34215) by debbugs.gnu.org; 27 Jan 2019 19:48:45 +0000 Received: from localhost ([127.0.0.1]:47986 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnqQ9-0007uM-KK for submit@debbugs.gnu.org; Sun, 27 Jan 2019 14:48:45 -0500 Received: from eggs.gnu.org ([209.51.188.92]:35686) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gnqQ8-0007u9-TN for 34215@debbugs.gnu.org; Sun, 27 Jan 2019 14:48:45 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:38168) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gnqPw-0005Xt-5p; Sun, 27 Jan 2019 14:48:33 -0500 Received: from [176.228.60.248] (port=3016 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gnqPp-0008IA-Mj; Sun, 27 Jan 2019 14:48:31 -0500 Date: Sun, 27 Jan 2019 21:48:10 +0200 Message-Id: <83a7jl51c5.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <875zu9gb96.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Sun, 27 Jan 2019 11:18:29 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Eric Abrahamsen > Date: Sun, 27 Jan 2019 11:18:29 -0800 > > >> PS: pinyin.map is ancient and is missing a lot of good correspondences. > >> Google's pinyin input method uses a much larger map, licensed with > >> Apache v2.0. This[1] seems to indicate that Apache 2.0 is okay for Gnu > >> projects, maybe we could consider switching to that map? > > > > Maybe. Unfortunately, I don't know enough about these input methods > > to tell whether replacing the file is a good idea. I wonder who can > > we ask about this. > > It's more or less a drop-in replacement -- the format of the data would > be the same, only a bit more of it. I understand, but I wonder if someone could try that for a while and see if it makes better input method(s), before we decide to import it. > I'm not sure who is "in charge" of these files, though. No one, I'm afraid. Not these days. 
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 29 Jan 2019 17:50:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.15487841962355 (code B ref -1); Tue, 29 Jan 2019 17:50:01 +0000 Received: (at submit) by debbugs.gnu.org; 29 Jan 2019 17:49:56 +0000 Received: from localhost ([127.0.0.1]:50789 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1goXWF-0000bv-W8 for submit@debbugs.gnu.org; Tue, 29 Jan 2019 12:49:56 -0500 Received: from eggs.gnu.org ([209.51.188.92]:50728) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1goXWE-0000bj-NN for submit@debbugs.gnu.org; Tue, 29 Jan 2019 12:49:55 -0500 Received: from lists.gnu.org ([209.51.188.17]:56077) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1goXW9-0006Uy-A9 for submit@debbugs.gnu.org; Tue, 29 Jan 2019 12:49:49 -0500 Received: from eggs.gnu.org ([209.51.188.92]:60922) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1goXW8-0007Oy-7O for bug-gnu-emacs@gnu.org; Tue, 29 Jan 2019 12:49:49 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: * X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_50,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1goXW7-0006UN-BO for bug-gnu-emacs@gnu.org; Tue, 29 Jan 2019 12:49:48 -0500 Received: from [195.159.176.226] (port=58500 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1goXW6-0006Rd-04 for bug-gnu-emacs@gnu.org; Tue, 29 Jan 2019 12:49:47 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1goXVt-000wSN-An for bug-gnu-emacs@gnu.org; Tue, 29 Jan 2019 18:49:33 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Tue, 29 Jan 2019 09:48:30 -0800 Message-ID: <87va27cq35.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:vzuwQR63CAdo5baFnOuTW2CMLHw= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> From: Eric Abrahamsen >> Date: Sun, 27 Jan 2019 11:18:29 -0800 I've attached a diff adding the conversion function itself, but I'm not familiar with makefiles and so far haven't been able to figure out how to call it. It looks like the invocation I want will look like: $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert \ ${srcdir}/MISC-DIC/pinyin.map ${srcdir}/../lisp/language/pinyin.el Where ${srcdir} is the leim directory, but I don't actually know how to get this code called by make... Additionally, I could factor the common code in py-converter and pinyin-convert out into a separate defsubst. 
>> >> PS: pinyin.map is ancient and is missing a lot of good correspondences. >> >> Google's pinyin input method uses a much larger map, licensed with >> >> Apache v2.0. This[1] seems to indicate that Apache 2.0 is okay for Gnu >> >> projects, maybe we could consider switching to that map? >> > >> > Maybe. Unfortunately, I don't know enough about these input methods >> > to tell whether replacing the file is a good idea. I wonder who can >> > we ask about this. >> >> It's more or less a drop-in replacement -- the format of the data would >> be the same, only a bit more of it. > > I understand, but I wonder if someone could try that for a while and > see if it makes better input method(s), before we decide to import it. FWIW, that mapping is used by the pyim package, which I believe is the most popular pinyin-based Chinese input method out there. I also use it via the system-wide input framework fcitx, and it works very well. >> I'm not sure who is "in charge" of these files, though. > > No one, I'm afraid. Not these days. That's too bad. Eric --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=pinyinconvert.diff diff --git a/lisp/international/titdic-cnv.el b/lisp/international/titdic-cnv.el index 2ce2c527b9..54d9fc6211 100644 --- a/lisp/international/titdic-cnv.el +++ b/lisp/international/titdic-cnv.el @@ -1203,6 +1203,37 @@ batch-miscdic-convert (miscdic-convert filename dir)))) (kill-emacs 0)) +(defun pinyin-convert () + "Convert text file pinyin.map into an elisp library. +The library is named pinyin.el, and contains the constant +`pinyin-character-map'." 
+ (let ((src-file (car command-line-args-left)) + (dst-file (cadr command-line-args-left))) + (with-temp-file dst-file + (insert ";; This file is automatically generated from pinyin.map,\ + by the function pinyin-convert.") + (insert "(defconst pinyin-character-map\n(") + (let ((pos (point))) + (insert-file-contents src-file) + (goto-char pos) + (re-search-forward "^[a-z]") + (beginning-of-line) + (delete-region pos (point)) + (while (not (eobp)) + (insert "(\"") + (skip-chars-forward "a-z") + (insert "\" \"") + (delete-char 1) + (end-of-line) + (while (= (preceding-char) ?\r) + (delete-char -1)) + (insert "\")") + (forward-line 1))) + (insert ")\n\"An alist holding correspondences between pinyin syllables\ + and Chinese characters.\")\n") + (insert "(provide 'pinyin)\n")) + (kill-emacs 0))) + ;; Prevent "Local Variables" above confusing Emacs. --=-=-=-- From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 30 Jan 2019 17:11:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.15488682021969 (code B ref 34215); Wed, 30 Jan 2019 17:11:02 +0000 Received: (at 34215) by debbugs.gnu.org; 30 Jan 2019 17:10:02 +0000 Received: from localhost ([127.0.0.1]:51886 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gotNC-0000Vh-BJ for submit@debbugs.gnu.org; Wed, 30 Jan 2019 12:10:02 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47380) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gotNA-0000V9-Ms for 34215@debbugs.gnu.org; Wed, 30 Jan 2019 12:10:01 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:46679) by 
eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gotN5-0002yb-BQ; Wed, 30 Jan 2019 12:09:55 -0500 Received: from [176.228.60.248] (port=4690 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gotN4-0002TL-Uq; Wed, 30 Jan 2019 12:09:55 -0500 Date: Wed, 30 Jan 2019 19:09:47 +0200 Message-Id: <83ftta138k.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87va27cq35.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Tue, 29 Jan 2019 09:48:30 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Eric Abrahamsen > Date: Tue, 29 Jan 2019 09:48:30 -0800 > > I've attached a diff adding the conversion function itself, but I'm not > familiar with makefiles and so far haven't been able to figure out how > to call it. It looks like the invocation I want will look like: > > $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert \ > ${srcdir}/MISC-DIC/pinyin.map ${srcdir}/../lisp/language/pinyin.el > > Where ${srcdir} is the leim directory, but I don't actually know how to > get this code called by make... Add a target that is the file produced by this command, then make the above command the recipe of that target. Similar to the ${leimdir}/ja-dic/ja-dic.el target. But if the above doesn't help, someone else could do this part for you. 
> > I understand, but I wonder if someone could try that for a while and
> > see if it makes better input method(s), before we decide to import it.
>
> FWIW, that mapping is used by the pyim package, which I believe is the
> most popular pinyin-based Chinese input method out there. I also use it
> via the system-wide input framework fcitx, and it works very well.

Then I guess we will be fine importing the new version.

> +(defun pinyin-convert ()
> + "Convert text file pinyin.map into an elisp library.
> +The library is named pinyin.el, and contains the constant
> +`pinyin-character-map'."

This writes out a .el file, but does it encode that file in UTF-8,
even if the locale's codeset is something other than UTF-8? If not,
you need to bind coding-system-for-write to UTF-8.

> + (insert ";; This file is automatically generated from pinyin.map,\
> + by the function pinyin-convert.")

This line is too long, suggest to break it in two.

> + (insert ")\n\"An alist holding correspondences between pinyin syllables\
> + and Chinese characters.\")\n")

Likewise here.

Thanks.
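For reference, the binding suggested above could look roughly like the
following. This is a minimal sketch, not the patch itself: the helper name
`my-write-utf8' is hypothetical and only serves to show
`coding-system-for-write' bound around `with-temp-file'.

```elisp
;; Hypothetical helper for illustration; not part of the patch under review.
(defun my-write-utf8 (file text)
  "Write TEXT to FILE as UTF-8, whatever the locale's codeset is."
  ;; `with-temp-file' saves the buffer with `write-region', which
  ;; honors a non-nil `coding-system-for-write' over the locale default.
  (let ((coding-system-for-write 'utf-8))
    (with-temp-file file
      (insert text))))
```

Binding the variable with `let' keeps the override local to this one write,
so other file operations in the same session still use their usual defaults.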
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 30 Jan 2019 20:35:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154888045821483 (code B ref -1); Wed, 30 Jan 2019 20:35:02 +0000 Received: (at submit) by debbugs.gnu.org; 30 Jan 2019 20:34:18 +0000 Received: from localhost ([127.0.0.1]:51965 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gowYs-0005aQ-4X for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:34:18 -0500 Received: from eggs.gnu.org ([209.51.188.92]:60535) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gowYp-0005aD-Qo for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:34:16 -0500 Received: from lists.gnu.org ([209.51.188.17]:37446) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gowYk-00057M-MC for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:34:10 -0500 Received: from eggs.gnu.org ([209.51.188.92]:46680) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gowYj-0007JM-Dg for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:34:10 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_20,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gowYi-00056f-8a for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:34:09 -0500 Received: from [195.159.176.226] (port=45186 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gowYi-00055w-0G for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:34:08 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gowYf-0007MD-AN for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 21:34:05 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Wed, 30 Jan 2019 12:33:56 -0800 Message-ID: <87munhnavf.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:vOZZjYtbwLz+W8tP9NHCagQny/Y= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> From: Eric Abrahamsen >> Date: Tue, 29 Jan 2019 09:48:30 -0800 >> >> I've attached a diff adding the conversion function itself, but I'm not >> familiar with makefiles and so far haven't been able to figure out how >> to call it. It looks like the invocation I want will look like: >> >> $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert \ >> ${srcdir}/MISC-DIC/pinyin.map ${srcdir}/../lisp/language/pinyin.el >> >> Where ${srcdir} is the leim directory, but I don't actually know how to >> get this code called by make... 
> > Add a target that is the file produced by this command, then make the > above command the recipe of that target. Similar to the > ${leimdir}/ja-dic/ja-dic.el target. > > But if the above doesn't help, someone else could do this part for > you. I've attached this as a commit patch -- it seems to work fine but I would appreciate it if you'd check it. >> > I understand, but I wonder if someone could try that for a while and >> > see if it makes better input method(s), before we decide to import it. >> >> FWIW, that mapping is used by the pyim package, which I believe is the >> most popular pinyin-based Chinese input method out there. I also use it >> via the system-wide input framework fcitx, and it works very well. > > Then I guess we will be fine importing the new version. Cool -- I'll file another report for this in a bit. >> +(defun pinyin-convert () >> + "Convert text file pinyin.map into an elisp library. >> +The library is named pinyin.el, and contains the constant >> +`pinyin-character-map'." > > This writes out a .el file, but does it encode that file in UTF-8, > even if the locale's codeset is something other than UTF-8? If not, > you need to bind coding-system-for-write to UTF-8. > >> + (insert ";; This file is automatically generated from pinyin.map,\ >> + by the function pinyin-convert.") > > This line is too long, suggest to break it in two. > >> + (insert ")\n\"An alist holding correspondences between pinyin syllables\ >> + and Chinese characters.\")\n") > > Likewise here. Okay, I've fixed all of the above. Thanks for the pointers. Eric --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-Make-pinyin-to-Chinese-character-mapping-available-t.patch >From 0aaa67f9717ae10b1dfd1c7f078400989123acb8 Mon Sep 17 00:00:00 2001 From: Eric Abrahamsen Date: Wed, 30 Jan 2019 12:31:49 -0800 Subject: [PATCH] Make pinyin to Chinese character mapping available to elisp * leim/Makefile.in: Build the file pinyin.el from pinyin.map. 
* lisp/international/titdic-cnv.el (pinyin-convert): New function that writes the library pinyin.el, containing a new constant `pinyin-character-map'. --- leim/Makefile.in | 7 ++++++- lisp/international/titdic-cnv.el | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/leim/Makefile.in b/leim/Makefile.in index c2fc8c41f2..cd693d6d0d 100644 --- a/leim/Makefile.in +++ b/leim/Makefile.in @@ -84,7 +84,8 @@ MISC= ${leimdir}/quail/PY.el \ ${leimdir}/quail/ZIRANMA.el \ ${leimdir}/quail/CTLau.el \ - ${leimdir}/quail/CTLau-b5.el + ${leimdir}/quail/CTLau-b5.el \ + ${leimdir}/../lisp/language/pinyin.el ## All the generated .el files. TIT_MISC = ${TIT_GB} ${TIT_BIG5} ${MISC} @@ -142,6 +143,10 @@ ${leimdir}/ja-dic/ja-dic.el: $(AM_V_GEN)$(RUN_EMACS) -batch -l ja-dic-cnv \ -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" "$<" +${leimdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map + $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert \ + ${srcdir}/MISC-DIC/pinyin.map ${srcdir}/../lisp/language/pinyin.el + .PHONY: bootstrap-clean distclean maintainer-clean extraclean diff --git a/lisp/international/titdic-cnv.el b/lisp/international/titdic-cnv.el index 2ce2c527b9..d33e9ff229 100644 --- a/lisp/international/titdic-cnv.el +++ b/lisp/international/titdic-cnv.el @@ -1203,6 +1203,38 @@ batch-miscdic-convert (miscdic-convert filename dir)))) (kill-emacs 0)) +(defun pinyin-convert () + "Convert text file pinyin.map into an elisp library. +The library is named pinyin.el, and contains the constant +`pinyin-character-map'." 
+ (let ((src-file (car command-line-args-left)) + (dst-file (cadr command-line-args-left)) + (coding-system-for-write 'utf-8-emacs)) + (with-temp-file dst-file + (insert ";; This file is automatically generated from pinyin.map,\ + by the\n;; function pinyin-convert.\n\n") + (insert "(defconst pinyin-character-map\n'(") + (let ((pos (point))) + (insert-file-contents src-file) + (goto-char pos) + (re-search-forward "^[a-z]") + (beginning-of-line) + (delete-region pos (point)) + (while (not (eobp)) + (insert "(\"") + (skip-chars-forward "a-z") + (insert "\" . \"") + (delete-char 1) + (end-of-line) + (while (= (preceding-char) ?\r) + (delete-char -1)) + (insert "\")") + (forward-line 1))) + (insert ")\n\"An alist holding correspondences between pinyin syllables\ + and\nChinese characters.\")\n\n") + (insert "(provide 'pinyin)\n")) + (kill-emacs 0))) + ;; Prevent "Local Variables" above confusing Emacs. -- 2.20.1 --=-=-=-- From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 30 Jan 2019 20:49:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154888130322866 (code B ref -1); Wed, 30 Jan 2019 20:49:02 +0000 Received: (at submit) by debbugs.gnu.org; 30 Jan 2019 20:48:23 +0000 Received: from localhost ([127.0.0.1]:51974 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gowmV-0005wj-0H for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:48:23 -0500 Received: from eggs.gnu.org ([209.51.188.92]:35969) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 
1gowmT-0005wY-OI for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:48:22 -0500 Received: from lists.gnu.org ([209.51.188.17]:57406) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gowmO-0003dv-Im for submit@debbugs.gnu.org; Wed, 30 Jan 2019 15:48:16 -0500 Received: from eggs.gnu.org ([209.51.188.92]:50342) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gowmN-0007Gw-NP for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:48:16 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gowmM-0003d0-Tq for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:48:15 -0500 Received: from [195.159.176.226] (port=58186 helo=blaine.gmane.org) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gowmM-0003c3-Mv for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 15:48:14 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gowmJ-000NAj-O8 for bug-gnu-emacs@gnu.org; Wed, 30 Jan 2019 21:48:11 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Wed, 30 Jan 2019 12:48:06 -0800 Message-ID: <87imy5na7t.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> Mime-Version: 1.0 Content-Type: text/plain User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:cEOIszy85HY4ptSCPM3ZMa/O3gA= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 
2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eric Abrahamsen writes: > Eli Zaretskii writes: > >>> From: Eric Abrahamsen >>> Date: Tue, 29 Jan 2019 09:48:30 -0800 >>> >>> I've attached a diff adding the conversion function itself, but I'm not >>> familiar with makefiles and so far haven't been able to figure out how >>> to call it. It looks like the invocation I want will look like: >>> >>> $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert \ >>> ${srcdir}/MISC-DIC/pinyin.map ${srcdir}/../lisp/language/pinyin.el >>> >>> Where ${srcdir} is the leim directory, but I don't actually know how to >>> get this code called by make... >> >> Add a target that is the file produced by this command, then make the >> above command the recipe of that target. Similar to the >> ${leimdir}/ja-dic/ja-dic.el target. >> >> But if the above doesn't help, someone else could do this part for >> you. > > I've attached this as a commit patch -- it seems to work fine but I > would appreciate it if you'd check it. 
Oh, after reading a couple of "make" tutorials, I see maybe the make rule could be simplified to: ${leimdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $0 From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Robert Pluim Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 31 Jan 2019 08:52:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154892466410572 (code B ref 34215); Thu, 31 Jan 2019 08:52:02 +0000 Received: (at 34215) by debbugs.gnu.org; 31 Jan 2019 08:51:04 +0000 Received: from localhost ([127.0.0.1]:52150 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gp83s-0002kS-7z for submit@debbugs.gnu.org; Thu, 31 Jan 2019 03:51:04 -0500 Received: from mail-wr1-f42.google.com ([209.85.221.42]:45005) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gp83r-0002jz-5O for 34215@debbugs.gnu.org; Thu, 31 Jan 2019 03:51:03 -0500 Received: by mail-wr1-f42.google.com with SMTP id z5so2275174wrt.11 for <34215@debbugs.gnu.org>; Thu, 31 Jan 2019 00:51:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:references:mail-copies-to:gmane-reply-to-list :date:in-reply-to:message-id:mime-version; bh=s69QpG1Uz+EX4+Xrx9thpURzgS9Xn+10xdPbX/DunvE=; b=WT0fPY+f1ntXwqwTS7kqFSlo6grLcdSPqlIMaWyGf53O9pqhbAavE/MSQfPK1V53NW nceXI8vmrJhVVPbH2FiY4few6N3IX1NxBnZrqYkxOQmXnl2U8WDJRfswZxbm//PbG4kr B+eVD8Am5TVsDLwJhae72PSIkQLsLjW6vo1IdrgVcNTAk5OOKC/RSsol2s1/mCfX7jmD SLpyI48s1PjRfeEhQ/CRu5g7IYMeSsXb8jCFjPsNbthW96wSiWbVbFLnEPM3fi9llT9r 
qxSWf8dNtA6qMXm7RlEbDfFyjrzWndPm9DOyxFvPsCu8vmzzE+lDETtsZtgil/cbOcct cnuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:mail-copies-to :gmane-reply-to-list:date:in-reply-to:message-id:mime-version; bh=s69QpG1Uz+EX4+Xrx9thpURzgS9Xn+10xdPbX/DunvE=; b=VM/faMrxF2XsbRI71Yhg/slU4taGcnIY3qNMiOJ9iLo2Es0MGI3p/xMT56U6VAnuwX JSGVJWTmgY4qNWBg9CEdXCR+GZx6dBMlTNaBS6d/D6+zJNibpVMf20dEfPhmliwzf+4F pl0Vr6x0ClcYIkvcXg8bnHH9SgKvndgc+lBT8EKEJx8MFd7nZj8ROJzwYQzK1L5bPnkG Ag1MET/lBiY3Z+aTBRi2TCQmSyj9Q2+wl3U9zQxY15LH1S3XPLIjvD232NUimAlRLCGh RLFnuJnb2Lrg/L0j8W1KVauY8FD8lGrtuoVoxptpXGiWIPg4z/bOdRKaE01ed5vmoK1S rUrg== X-Gm-Message-State: AJcUuketTwFQ0cR5f8SbHuZXodL/L13PNl1VkvOC1kwFacjavsEKvqxZ cOZbt1y8pYG1O3Z4sESYwLY+9fmM X-Google-Smtp-Source: ALg8bN6kvsOgtykcc4XKwdKws1zcbZTcvc99jM6OotaIZR+/CZdODmxGaXNl6Q82J8qLf34Tzi++3A== X-Received: by 2002:a5d:6105:: with SMTP id v5mr32600990wrt.63.1548924655761; Thu, 31 Jan 2019 00:50:55 -0800 (PST) Received: from rpluim-mac ([149.5.228.1]) by smtp.gmail.com with ESMTPSA id y185sm2823724wmg.34.2019.01.31.00.50.54 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Thu, 31 Jan 2019 00:50:54 -0800 (PST) From: Robert Pluim References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> Mail-Copies-To: never Gmane-Reply-To-List: yes Date: Thu, 31 Jan 2019 09:50:54 +0100 In-Reply-To: <87imy5na7t.fsf@ericabrahamsen.net> (Eric Abrahamsen's message of "Wed, 30 Jan 2019 12:48:06 -0800") Message-ID: MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Eric Abrahamsen writes: > > Oh, after reading a couple of "make" tutorials, I see maybe the make > rule could be simplified to: > > ${leimdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map > $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $0 $@ , I think. Robert From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Thu, 31 Jan 2019 19:36:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154896335413760 (code B ref -1); Thu, 31 Jan 2019 19:36:02 +0000 Received: (at submit) by debbugs.gnu.org; 31 Jan 2019 19:35:54 +0000 Received: from localhost ([127.0.0.1]:53365 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpI7t-0003Zs-IP for submit@debbugs.gnu.org; Thu, 31 Jan 2019 14:35:53 -0500 Received: from eggs.gnu.org ([209.51.188.92]:34209) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpI7r-0003Zb-Kp for submit@debbugs.gnu.org; Thu, 31 Jan 2019 14:35:52 -0500 Received: from lists.gnu.org ([209.51.188.17]:49650) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpI7m-0002Lu-CG for submit@debbugs.gnu.org; Thu, 31 Jan 2019 14:35:46 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48577) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gpI7l-0002tB-5v for bug-gnu-emacs@gnu.org; Thu, 31 Jan 2019 14:35:46 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 
(2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_40,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gpI7j-0002KT-Up for bug-gnu-emacs@gnu.org; Thu, 31 Jan 2019 14:35:45 -0500 Received: from [195.159.176.226] (port=34846 helo=blaine.gmane.org) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpI7j-0002HZ-M4 for bug-gnu-emacs@gnu.org; Thu, 31 Jan 2019 14:35:43 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gpI7e-000Adz-Ll for bug-gnu-emacs@gnu.org; Thu, 31 Jan 2019 20:35:38 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Thu, 31 Jan 2019 11:35:32 -0800 Message-ID: <87a7jghb7f.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:P+KmbKtW87hyte99cJOzvrCfV2o= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --=-=-= Content-Type: text/plain Robert Pluim writes: > Eric Abrahamsen writes: > >> >> Oh, after reading a couple of "make" tutorials, I see maybe the make >> rule could be 
simplified to: >> >> ${leimdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map >> $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $0 > > $@ , I think. Ah, right you are, thanks. I was wondering why that wasn't working. This version should do the trick; it also gitignores the generated file. Eric --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=0001-Make-pinyin-to-Chinese-character-mapping-available-t.patch >From 2395d1a62e66206c04b7069d372fdb4f10787863 Mon Sep 17 00:00:00 2001 From: Eric Abrahamsen Date: Wed, 30 Jan 2019 12:31:49 -0800 Subject: [PATCH 1/2] Make pinyin to Chinese character mapping available to elisp * leim/Makefile.in: Build the file pinyin.el from pinyin.map. * lisp/international/titdic-cnv.el (pinyin-convert): New function that writes the library pinyin.el, containing a new constant `pinyin-character-map'. * .gitignore: Ignore the generated pinyin.el file. --- .gitignore | 1 + leim/Makefile.in | 6 +++++- lisp/international/titdic-cnv.el | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 38 insertions(+), 1 deletion(-) diff --git a/.gitignore b/.gitignore index 53f41f0f3e..f3d5ccb0f8 100644 --- a/.gitignore +++ b/.gitignore @@ -199,6 +199,7 @@ lisp/international/charscript.el lisp/international/cp51932.el lisp/international/eucjp-ms.el lisp/international/uni-*.el +lisp/language/pinyin.el # Documentation. *.aux diff --git a/leim/Makefile.in b/leim/Makefile.in index c2fc8c41f2..4307d50087 100644 --- a/leim/Makefile.in +++ b/leim/Makefile.in @@ -84,7 +84,8 @@ MISC= ${leimdir}/quail/PY.el \ ${leimdir}/quail/ZIRANMA.el \ ${leimdir}/quail/CTLau.el \ - ${leimdir}/quail/CTLau-b5.el + ${leimdir}/quail/CTLau-b5.el \ + ${srcdir}/../lisp/language/pinyin.el ## All the generated .el files. 
TIT_MISC = ${TIT_GB} ${TIT_BIG5} ${MISC} @@ -142,6 +143,9 @@ ${leimdir}/ja-dic/ja-dic.el: $(AM_V_GEN)$(RUN_EMACS) -batch -l ja-dic-cnv \ -f batch-skkdic-convert -dir "$(leimdir)/ja-dic" "$<" +${srcdir}/../lisp/language/pinyin.el: ${srcdir}/MISC-DIC/pinyin.map + $(AM_V_GEN)${RUN_EMACS} -l titdic-cnv -f pinyin-convert $< $@ + .PHONY: bootstrap-clean distclean maintainer-clean extraclean diff --git a/lisp/international/titdic-cnv.el b/lisp/international/titdic-cnv.el index 2ce2c527b9..d33e9ff229 100644 --- a/lisp/international/titdic-cnv.el +++ b/lisp/international/titdic-cnv.el @@ -1203,6 +1203,38 @@ batch-miscdic-convert (miscdic-convert filename dir)))) (kill-emacs 0)) +(defun pinyin-convert () + "Convert text file pinyin.map into an elisp library. +The library is named pinyin.el, and contains the constant +`pinyin-character-map'." + (let ((src-file (car command-line-args-left)) + (dst-file (cadr command-line-args-left)) + (coding-system-for-write 'utf-8-emacs)) + (with-temp-file dst-file + (insert ";; This file is automatically generated from pinyin.map,\ + by the\n;; function pinyin-convert.\n\n") + (insert "(defconst pinyin-character-map\n'(") + (let ((pos (point))) + (insert-file-contents src-file) + (goto-char pos) + (re-search-forward "^[a-z]") + (beginning-of-line) + (delete-region pos (point)) + (while (not (eobp)) + (insert "(\"") + (skip-chars-forward "a-z") + (insert "\" . \"") + (delete-char 1) + (end-of-line) + (while (= (preceding-char) ?\r) + (delete-char -1)) + (insert "\")") + (forward-line 1))) + (insert ")\n\"An alist holding correspondences between pinyin syllables\ + and\nChinese characters.\")\n\n") + (insert "(provide 'pinyin)\n")) + (kill-emacs 0))) + ;; Prevent "Local Variables" above confusing Emacs. 
-- 2.20.1 --=-=-=-- From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 01 Feb 2019 09:50:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154901454324966 (code B ref 34215); Fri, 01 Feb 2019 09:50:02 +0000 Received: (at 34215) by debbugs.gnu.org; 1 Feb 2019 09:49:03 +0000 Received: from localhost ([127.0.0.1]:53741 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpVRX-0006Uc-Ay for submit@debbugs.gnu.org; Fri, 01 Feb 2019 04:49:03 -0500 Received: from eggs.gnu.org ([209.51.188.92]:41430) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpVRW-0006U9-2N for 34215@debbugs.gnu.org; Fri, 01 Feb 2019 04:49:02 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:55266) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gpVRQ-0004HP-Pr; Fri, 01 Feb 2019 04:48:56 -0500 Received: from [176.228.60.248] (port=4132 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gpVRQ-0007wb-DY; Fri, 01 Feb 2019 04:48:56 -0500 Date: Fri, 01 Feb 2019 11:48:52 +0200 Message-Id: <83tvhnyh2z.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87a7jghb7f.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Thu, 31 Jan 2019 11:35:32 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> 
<87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Eric Abrahamsen > Date: Thu, 31 Jan 2019 11:35:32 -0800 > > +(defun pinyin-convert () > + "Convert text file pinyin.map into an elisp library. > +The library is named pinyin.el, and contains the constant > +`pinyin-character-map'." > + (let ((src-file (car command-line-args-left)) > + (dst-file (cadr command-line-args-left)) > + (coding-system-for-write 'utf-8-emacs)) This should be 'utf-8-unix. There's no reason to write out stuff in our internal encoding, as the file is not supposed to have any characters not representable in UTF-8. Otherwise, this LGTM. Let's wait for a few days for more comments, and then push. Thanks. 
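To illustrate the utf-8-unix versus utf-8-emacs point, here is a rough
sketch assuming a stock Emacs. The sample string is made up and is not taken
from pinyin.map. For text that is representable in Unicode, both coding
systems should produce the same bytes; the -unix suffix additionally pins
the end-of-line convention to LF.

```elisp
;; Illustrative only; the sample text is hypothetical, not real map data.
(let ((sample "阿啊"))
  ;; Both coding systems should encode ordinary Unicode text identically;
  ;; utf-8-emacs only differs for characters plain UTF-8 cannot represent.
  (equal (encode-coding-string sample 'utf-8-unix)
         (encode-coding-string sample 'utf-8-emacs)))

;; An eol-type of 0 means LF line endings are forced on output.
(coding-system-eol-type 'utf-8-unix)
```

So switching the binding should not change the bytes written for the map
data; it mainly makes the generated file readable as standard UTF-8 with
Unix line endings by tools other than Emacs.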
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 01 Feb 2019 16:28:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.154903844816187 (code B ref -1); Fri, 01 Feb 2019 16:28:01 +0000 Received: (at submit) by debbugs.gnu.org; 1 Feb 2019 16:27:28 +0000 Received: from localhost ([127.0.0.1]:55191 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpbf6-0004D1-4X for submit@debbugs.gnu.org; Fri, 01 Feb 2019 11:27:28 -0500 Received: from eggs.gnu.org ([209.51.188.92]:49933) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpbf4-0004Cn-1d for submit@debbugs.gnu.org; Fri, 01 Feb 2019 11:27:26 -0500 Received: from lists.gnu.org ([209.51.188.17]:58666) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpbey-0003Rs-Tf for submit@debbugs.gnu.org; Fri, 01 Feb 2019 11:27:20 -0500 Received: from eggs.gnu.org ([209.51.188.92]:36082) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gpbey-0008HJ-0D for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 11:27:20 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gpbew-0003RC-EC for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 11:27:19 -0500 Received: from [195.159.176.226] (port=33764 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpbew-0003QP-7b for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 11:27:18 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gpbet-000qrY-VW for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 17:27:15 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Fri, 01 Feb 2019 08:27:08 -0800 Message-ID: <87imy3o4o3.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:l1AkjkfSwWzon7WKiisCGqxCo1k= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Eli Zaretskii writes:

>> From: Eric Abrahamsen
>> Date: Thu, 31 Jan 2019 11:35:32 -0800
>>
>> +(defun pinyin-convert ()
>> +  "Convert text file pinyin.map into an elisp library.
>> +The library is named pinyin.el, and contains the constant
>> +`pinyin-character-map'."
>> +  (let ((src-file (car command-line-args-left))
>> +        (dst-file (cadr command-line-args-left))
>> +        (coding-system-for-write 'utf-8-emacs))
>
> This should be 'utf-8-unix. There's no reason to write out stuff in
> our internal encoding, as the file is not supposed to have any
> characters not representable in UTF-8.

Oh, okay. For my information -- is that not platform-dependent? I
noticed titdic-cnv.el has a utf-8-emacs encoding cookie at the top.

> Otherwise, this LGTM. Let's wait for a few days for more comments,
> and then push.

Sure thing.

From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 01 Feb 2019 18:55:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.154904725130879 (code B ref 34215); Fri, 01 Feb 2019 18:55:02 +0000 Received: (at 34215) by debbugs.gnu.org; 1 Feb 2019 18:54:11 +0000 Received: from localhost ([127.0.0.1]:55267 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpdx4-00081y-VX for submit@debbugs.gnu.org; Fri, 01 Feb 2019 13:54:11 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47369) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpdx3-00081l-G8 for 34215@debbugs.gnu.org; Fri, 01 Feb 2019 13:54:09 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:48169) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gpdwq-00040i-3J; Fri, 01 Feb 2019 13:53:58 -0500 Received: from [176.228.60.248] (port=2132 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gpdwp-0003Y5-NM; Fri, 01 Feb 2019 13:53:56 -0500 Date: Fri, 01 Feb 2019 20:53:39 +0200 Message-Id: <837eejxrv0.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87imy3o4o3.fsf@ericabrahamsen.net> (message from Eric
Abrahamsen on Fri, 01 Feb 2019 08:27:08 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> <87imy3o4o3.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

> From: Eric Abrahamsen
> Date: Fri, 01 Feb 2019 08:27:08 -0800
>
> >> +        (coding-system-for-write 'utf-8-emacs))
> >
> > This should be 'utf-8-unix. There's no reason to write out stuff in
> > our internal encoding, as the file is not supposed to have any
> > characters not representable in UTF-8.
>
> Oh, okay. For my information -- is that not platform-dependent?

No, the defaults are platform-dependent. utf-8-unix is an explicit
specification of an encoding, so it leaves nothing to the platform.

> I noticed titdic-cnv.el has a utf-8-emacs encoding cookie at the
> top.

utf-8-emacs is the internal representation of characters used by
Emacs, it should only be used when some of the characters might not be
expressible in UTF-8 (i.e. they are beyond the Unicode codespace).

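A minimal sketch of the point about explicit encodings made in the message above: binding `coding-system-for-write' to `utf-8-unix' fixes both the encoding and the end-of-line convention, whatever the platform defaults are. The names `my/write-utf8-file' and `my/demo-write', and the sample line, are hypothetical and are not part of the patch under review.

```elisp
;; Hypothetical helper, not part of the patch: write TEXT to FILE with
;; an explicitly specified encoding instead of the platform default.
(defun my/write-utf8-file (file text)
  "Write TEXT to FILE as UTF-8 with Unix (LF) line endings."
  (let ((coding-system-for-write 'utf-8-unix))
    (with-temp-file file
      (insert text))))

;; Illustration only: write one made-up line in pinyin.map style to a
;; temporary file, then report the size of that file in bytes.
(defun my/demo-write ()
  "Return the size in bytes of a small file written as `utf-8-unix'."
  (let ((file (make-temp-file "pinyin-demo")))
    (unwind-protect
        (progn
          (my/write-utf8-file file "a\t阿啊\n")
          ;; Element 7 of the attribute list is the file size.
          (nth 7 (file-attributes file)))
      (delete-file file))))
```

Evaluating (my/demo-write) should give 9 on any platform: two ASCII bytes, three bytes for each of the two Chinese characters, and a single LF.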
From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping In-Reply-To: <87imyafyts.fsf@ericabrahamsen.net> Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 01 Feb 2019 19:16:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.1549048539449 (code B ref -1); Fri, 01 Feb 2019 19:16:02 +0000 Received: (at submit) by debbugs.gnu.org; 1 Feb 2019 19:15:39 +0000 Received: from localhost ([127.0.0.1]:55272 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpeHq-00007A-Qo for submit@debbugs.gnu.org; Fri, 01 Feb 2019 14:15:39 -0500 Received: from eggs.gnu.org ([209.51.188.92]:55243) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gpeHo-00006v-Eg for submit@debbugs.gnu.org; Fri, 01 Feb 2019 14:15:37 -0500 Received: from lists.gnu.org ([209.51.188.17]:58960) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpeHf-00008A-83 for submit@debbugs.gnu.org; Fri, 01 Feb 2019 14:15:28 -0500 Received: from eggs.gnu.org ([209.51.188.92]:41379) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gpeHe-00027l-9k for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 14:15:27 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,RDNS_NONE autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gpeHZ-000054-2L for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 14:15:23 -0500 Received: from [195.159.176.226] (port=51084 helo=blaine.gmane.org) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gpeHW-0008TZ-Dp for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 14:15:20 -0500 Received: from list by blaine.gmane.org with local (Exim 4.89) (envelope-from ) id 1gpeHR-000hLJ-Nd for bug-gnu-emacs@gnu.org; Fri, 01 Feb 2019 20:15:13 +0100 X-Injected-Via-Gmane: http://gmane.org/ From: Eric Abrahamsen Date: Fri, 01 Feb 2019 11:15:08 -0800 Message-ID: <87k1ijmibn.fsf@ericabrahamsen.net> References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> <87imy3o4o3.fsf@ericabrahamsen.net> <837eejxrv0.fsf@gnu.org> Mime-Version: 1.0 Content-Type: text/plain User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) Cancel-Lock: sha1:qjUtjvgEleeiAOMisbsavrCjW7s= X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 195.159.176.226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Eli Zaretskii writes:

>> From: Eric Abrahamsen
>> Date: Fri, 01 Feb 2019 08:27:08 -0800
>>
>> >> +        (coding-system-for-write 'utf-8-emacs))
>> >
>> > This should be 'utf-8-unix. There's no reason to write out stuff in
>> > our internal encoding, as the file is not supposed to have any
>> > characters not representable in UTF-8.
>>
>> Oh, okay. For my information -- is that not platform-dependent?
>
> No, the defaults are platform-dependent. utf-8-unix is an explicit
> specification of an encoding, so it leaves nothing to the platform.
>
>> I noticed titdic-cnv.el has a utf-8-emacs encoding cookie at the
>> top.
>
> utf-8-emacs is the internal representation of characters used by
> Emacs, it should only be used when some of the characters might not be
> expressible in UTF-8 (i.e. they are beyond the Unicode codespace).

Interesting, thank you for this background.

From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 24 Feb 2019 05:37:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.155098658310181 (code B ref 34215); Sun, 24 Feb 2019 05:37:02 +0000 Received: (at 34215) by debbugs.gnu.org; 24 Feb 2019 05:36:23 +0000 Received: from localhost ([127.0.0.1]:49758 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxmSd-0002e9-2c for submit@debbugs.gnu.org; Sun, 24 Feb 2019 00:36:23 -0500 Received: from ericabrahamsen.net ([52.70.2.18]:40436 helo=mail.ericabrahamsen.net) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxmSZ-0002dt-BO for 34215@debbugs.gnu.org; Sun, 24 Feb 2019 00:36:20 -0500 Received: from localhost (unknown [172.58.46.144]) (Authenticated sender: eric@ericabrahamsen.net) by mail.ericabrahamsen.net (Postfix) with ESMTPSA id 99A4EFA02F for <34215@debbugs.gnu.org>; Sun, 24 Feb 2019 05:36:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ericabrahamsen.net; s=mail; t=1550986573; bh=vHlze+YUwbRrUpNGNe3YVV4qXyyQGNh2X/Ik77v+Kdc=; h=From:To:Subject:References:Date:In-Reply-To:From;
b=KPYhI0ygF0yHCRwO0iQWQf98Z8Y/qMG+B3nhoKhpsBcejCvKDyqodIdT4j+lKSRZD LaszzXqehfaK6VsgzDHwI1qh2hKUClTN7/RGZpum6UDWBOxusEMoxfflVV1h+LlxjM TX7K1p2YMdE27RV+pGdPiVdBlevrIyG1e0TKMnJA= From: Eric Abrahamsen References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> Date: Sat, 23 Feb 2019 21:36:10 -0800 In-Reply-To: <83tvhnyh2z.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 01 Feb 2019 11:48:52 +0200") Message-ID: <87ftsd3fzp.fsf@ericabrahamsen.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On 02/01/19 11:48 AM, Eli Zaretskii wrote:
>> From: Eric Abrahamsen
>> Date: Thu, 31 Jan 2019 11:35:32 -0800
>>
>> +(defun pinyin-convert ()
>> +  "Convert text file pinyin.map into an elisp library.
>> +The library is named pinyin.el, and contains the constant
>> +`pinyin-character-map'."
>> +  (let ((src-file (car command-line-args-left))
>> +        (dst-file (cadr command-line-args-left))
>> +        (coding-system-for-write 'utf-8-emacs))
>
> This should be 'utf-8-unix. There's no reason to write out stuff in
> our internal encoding, as the file is not supposed to have any
> characters not representable in UTF-8.
>
> Otherwise, this LGTM. Let's wait for a few days for more comments,
> and then push.

Doesn't look like anything more is forthcoming, shall I push to master?

From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 24 Feb 2019 16:07:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eric Abrahamsen Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.155102437411180 (code B ref 34215); Sun, 24 Feb 2019 16:07:02 +0000 Received: (at 34215) by debbugs.gnu.org; 24 Feb 2019 16:06:14 +0000 Received: from localhost ([127.0.0.1]:50353 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxwIA-0002uG-8E for submit@debbugs.gnu.org; Sun, 24 Feb 2019 11:06:14 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47461) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxwI8-0002tx-4D for 34215@debbugs.gnu.org; Sun, 24 Feb 2019 11:06:12 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:42028) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gxwI1-0005nV-0k; Sun, 24 Feb 2019 11:06:06 -0500 Received: from [176.228.60.248] (port=3975 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1gxwI0-0001zo-Ki; Sun, 24 Feb 2019 11:06:04 -0500 Date: Sun, 24 Feb 2019 18:06:13 +0200 Message-Id: <834l8tp3wq.fsf@gnu.org> From: Eli Zaretskii In-reply-to: <87ftsd3fzp.fsf@ericabrahamsen.net> (message from Eric Abrahamsen on Sat, 23 Feb 2019 21:36:10 -0800) References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> 
<87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> <87ftsd3fzp.fsf@ericabrahamsen.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

> From: Eric Abrahamsen
> Date: Sat, 23 Feb 2019 21:36:10 -0800
>
> > Otherwise, this LGTM. Let's wait for a few days for more comments,
> > and then push.
>
> Doesn't look like anything more is forthcoming, shall I push to master?

Yes, please.

From unknown Fri Jun 13 11:35:59 2025 X-Loop: help-debbugs@gnu.org Subject: bug#34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Resent-From: Eric Abrahamsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 24 Feb 2019 18:55:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 34215 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: 34215@debbugs.gnu.org Received: via spool by 34215-submit@debbugs.gnu.org id=B34215.155103444827229 (code B ref 34215); Sun, 24 Feb 2019 18:55:01 +0000 Received: (at 34215) by debbugs.gnu.org; 24 Feb 2019 18:54:08 +0000 Received: from localhost ([127.0.0.1]:50492 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxyud-000756-7a for submit@debbugs.gnu.org; Sun, 24 Feb 2019 13:54:08 -0500 Received: from ericabrahamsen.net ([52.70.2.18]:41386 helo=mail.ericabrahamsen.net) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxyua-00074c-Nh for 34215@debbugs.gnu.org; Sun, 24 Feb 2019 13:54:06 -0500 Received: from localhost (67-40-27-198.tukw.qwest.net [67.40.27.198]) (Authenticated sender: eric@ericabrahamsen.net) by mail.ericabrahamsen.net (Postfix) with ESMTPSA id 0CB48FA03D; Sun,
24 Feb 2019 18:53:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ericabrahamsen.net; s=mail; t=1551034439; bh=MSBeAV9MhVdUzE9z38c3eW0HoUktZ3Ce+HWAekEuAGA=; h=From:To:Cc:Subject:References:Date:In-Reply-To:From; b=N/aJh7h67B5yYgvAoyA8h73NKShcqkEgp13YbnfRwOOfSh0tdeefUPAeLkGY1y2js uPRez5ibuSgdxudHseG+hF4eGOnOf8QiEHSswNQh8/liHTaXAOf0zeeFNhBQJVDewV SiqkjG/jz+brTIMcGwXCP+mbMLjFU5TYrQ3g/S5Q= From: Eric Abrahamsen References: <87imyafyts.fsf@ericabrahamsen.net> <83r2cy3y69.fsf@gnu.org> <87a7jmf06v.fsf@ericabrahamsen.net> <83ef8y3r40.fsf@gnu.org> <875zu9gb96.fsf@ericabrahamsen.net> <83a7jl51c5.fsf@gnu.org> <87va27cq35.fsf@ericabrahamsen.net> <83ftta138k.fsf@gnu.org> <87munhnavf.fsf@ericabrahamsen.net> <87imy5na7t.fsf@ericabrahamsen.net> <87a7jghb7f.fsf@ericabrahamsen.net> <83tvhnyh2z.fsf@gnu.org> <87ftsd3fzp.fsf@ericabrahamsen.net> <834l8tp3wq.fsf@gnu.org> Date: Sun, 24 Feb 2019 10:53:57 -0800 In-Reply-To: <834l8tp3wq.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 24 Feb 2019 18:06:13 +0200") Message-ID: <87r2bx10hm.fsf@ericabrahamsen.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

On 02/24/19 18:06 PM, Eli Zaretskii wrote:
>> From: Eric Abrahamsen
>> Date: Sat, 23 Feb 2019 21:36:10 -0800
>>
>> > Otherwise, this LGTM. Let's wait for a few days for more comments,
>> > and then push.
>>
>> Doesn't look like anything more is forthcoming, shall I push to master?
>
> Yes, please.

Done, thanks.

From unknown Fri Jun 13 11:35:59 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Eric Abrahamsen Subject: bug#34215: closed () Message-ID: References: <87bm310zn3.fsf@ericabrahamsen.net> <87imyafyts.fsf@ericabrahamsen.net> X-Gnu-PR-Message: they-closed 34215 X-Gnu-PR-Package: emacs Reply-To: 34215@debbugs.gnu.org Date: Sun, 24 Feb 2019 19:13:04 +0000 Content-Type: multipart/mixed; boundary="----------=_1551035584-29023-1" This is a multi-part message in MIME format... ------------=_1551035584-29023-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #34215: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 34215@debbugs.gnu.org. 
--=20 34215: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D34215 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1551035584-29023-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 34215-done) by debbugs.gnu.org; 24 Feb 2019 19:12:26 +0000 Received: from localhost ([127.0.0.1]:50517 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxzCM-0007X7-9E for submit@debbugs.gnu.org; Sun, 24 Feb 2019 14:12:26 -0500 Received: from ericabrahamsen.net ([52.70.2.18]:41458 helo=mail.ericabrahamsen.net) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gxzCK-0007Ws-6e for 34215-done@debbugs.gnu.org; Sun, 24 Feb 2019 14:12:24 -0500 Received: from localhost (67-40-27-198.tukw.qwest.net [67.40.27.198]) (Authenticated sender: eric@ericabrahamsen.net) by mail.ericabrahamsen.net (Postfix) with ESMTPSA id 63F18FA03D for <34215-done@debbugs.gnu.org>; Sun, 24 Feb 2019 19:12:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ericabrahamsen.net; s=mail; t=1551035538; bh=frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN/XKdLCPjaYaY=; h=From:To:Subject:Date:From; b=R0VIWLwr5lK8WwOuJwGo/gRAwkEv8yRSUVvk6a+cDZpKhmSD3oRW/qANuZxb0KqBt BYdIoUrOW+RTeMyq22ExSjTYY0JMO7Qtx/MnO05WfqkDkJtiA++kKaZvJLbqKYRPtR PU0uEAWCEfxV/rPNbN1k8jZLxUYS8EF6nRG98ojg= From: Eric Abrahamsen To: 34215-done@debbugs.gnu.org Subject: Date: Sun, 24 Feb 2019 11:12:16 -0800 Message-ID: <87bm310zn3.fsf@ericabrahamsen.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: 4.3 (++++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. 
Content preview: Content analysis details: (4.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: ericabrahamsen.net] -0.0 SPF_PASS SPF: sender matches SPF record 2.0 BLANK_SUBJECT Subject is present but empty 2.3 EMPTY_MESSAGE Message appears to have no textual parts and no Subject: text X-Debbugs-Envelope-To: 34215-done X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 3.3 (+++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Content analysis details: (3.3 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. 
[URIs: ericabrahamsen.net] -0.0 SPF_PASS SPF: sender matches SPF record -1.0 MAILING_LIST_MULTI Multiple indicators imply a widely-seen list manager 2.0 BLANK_SUBJECT Subject is present but empty 2.3 EMPTY_MESSAGE Message appears to have no textual parts and no Subject: text ------------=_1551035584-29023-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 27 Jan 2019 05:43:35 +0000 Received: from localhost ([127.0.0.1]:46925 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gndEE-0007jc-Vj for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:35 -0500 Received: from eggs.gnu.org ([209.51.188.92]:38011) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1gndED-0007jO-OM for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:34 -0500 Received: from lists.gnu.org ([209.51.188.17]:53030) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gndE8-0005na-CK for submit@debbugs.gnu.org; Sun, 27 Jan 2019 00:43:28 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48156) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gndE7-0003TG-Ax for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:43:28 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_50,RCVD_IN_DNSWL_MED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gnd5n-00078y-J9 for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:34:52 -0500 Received: from mail.ericabrahamsen.net ([50.56.99.223]:41691) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gnd5n-00070g-AR for bug-gnu-emacs@gnu.org; Sun, 27 Jan 2019 00:34:51 -0500 Received: from localhost (71-212-20-199.tukw.qwest.net [71.212.20.199]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) (Authenticated sender: eric@ericabrahamsen.net) by mail.ericabrahamsen.net (Postfix) with ESMTPSA id 704B13FB4D for ; Sun, 27 Jan 2019 05:34:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mail.ericabrahamsen.net; s=mail; t=1548567281; bh=hATpaYe8b5gfblQkICWNiyyFL7jF0VR8OY9FHuU4TfA=; h=From:To:Subject:Date:From; b=HAsjdzu51V8s45m5iapxQWt0meCNtuM9SX2wk8evWls+5UPAr6Sum5WyG+EFz7zNd H2Y4xEw7JClZuK7lsV7RXQClSMLVoxewhx2oNYnsbxfuOpf5JfnyzxLuxso2RNDTqC v8Jtc0218f81O5ylWyUrlRS+IuwXnXEnyp7PshZo= From: Eric Abrahamsen To: bug-gnu-emacs@gnu.org Subject: 27.0.50; Provide elisp access to Chinese pinyin-to-character mapping Date: Sat, 26 Jan 2019 21:34:39 -0800 Message-ID: <87imyafyts.fsf@ericabrahamsen.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 50.56.99.223 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Spam-Score: 0.9 (/) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.1 (/) --=-=-= Content-Type: text/plain This bug report is apropos to this[1] emacs.devel thread. The basic idea is that in the Emacs sources there's a file containing a mapping between pinyin -- the most common Chinese romanization system -- and Chinese characters themselves. The mapping lives in leim/MISC-DIC/pinyin.map, and is converted into a quail input method by the `py-converter' function in titdic-cnv.el, which is part of the "make" process. 
I want this mapping to be available to elisp code in general, because
it's useful for all kinds of other language utilities (searching
Chinese characters using ascii letters, etc).

pinyin.map is a plain text file, each line consisting of a romanized
syllable, a TAB, and a string of the possible corresponding Chinese
characters. `titdic-convert' parses this and feeds it to
`quail-define-rules'.

My first thought was to add an intermediate step, where
`titdic-convert' first composes an alist, then feeds that alist to
`quail-define-rules', which would also allow us access to the alist.
The more I looked at it, the more hacky and awkward that approach
seemed, and it's not like it would save any memory: you still end up
with the data both in a quail method, and in a separate alist.

So this proposed patch simply parses the same file in the same way,
but in a different location. I've put it in china-util.el, but
chinese.el would also be a reasonable spot. Both those files are
concerned with encoding, but at least "china-util" gives the
impression that it could be a grab-bag.

I'm not sure this use of `source-directory' is particularly robust,
but I don't know how else to handle it.

Hope this will be considered!

Eric

[1]: https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00306.html

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=0001-New-constant-chinese-pinyin-character-map.patch

>From f63b918057f7eaf6f8eebb28071ac17dd5ab3ff1 Mon Sep 17 00:00:00 2001
From: Eric Abrahamsen
Date: Sat, 26 Jan 2019 20:11:23 -0800
Subject: [PATCH] New constant chinese-pinyin-character-map

* lisp/language/china-util.el (chinese-pinyin-character-map): Constant
holding an alist built from the pinyin-to-character mapping provided
in the file pinyin.map.
--- lisp/language/china-util.el | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/lisp/language/china-util.el b/lisp/language/china-util.el index 70710bac18..cdbd8e322f 100644 --- a/lisp/language/china-util.el +++ b/lisp/language/china-util.el @@ -30,7 +30,7 @@ ;;; Code: -;; Hz/ZW/EUC-TW encoding stuff +;; Hz/ZW/EUC-TW encoding stuff, also a pinyin-to-character mapping. ;; HZ is an encoding method for Chinese character set GB2312 used ;; widely in Internet. It is very similar to 7-bit environment of @@ -202,6 +202,30 @@ pre-write-encode-hz (let (last-coding-system-used) (encode-hz-region 1 (point-max))) nil)) + +;;; Elisp-accessible version of the pinyin-to-character mapping +;;; provided in leim/MISC-DIC/pinyin.map, which is otherwise only +;;; exposed to the quail input method. + +(eval-and-compile + (defconst chinese-pinyin-character-map + (let ((py-file (expand-file-name + "leim/MISC-DIC/pinyin.map" + source-directory)) + alst) + (with-temp-buffer + (insert-file-contents py-file) + (re-search-forward "^[^%]" (point-max) t) + (beginning-of-line) + (while (re-search-forward "^\\([[:ascii:]]+\\)\t\\(\\cc+\\)$" + (point-max) t) + (push (cons (match-string-no-properties 1) + (match-string-no-properties 2)) +alst)) + (nreverse alst))) + "An alist mapping pinyin syllables to Chinese characters. +Produced from data in pinyin.map.")) + ;; (provide 'china-util) -- 2.20.1 --=-=-=-- ------------=_1551035584-29023-1--
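
The line format described in the original report (a romanized syllable, a TAB, then a string of candidate characters) can be tried out independently of the Emacs build. The following is an illustration under stated assumptions, not the code that was pushed: the `my/' names and the two sample entries are made up, and the regexp is a simplified stand-in for the one in the patch that assumes lowercase ASCII syllables followed by non-ASCII characters.

```elisp
;; Made-up sample in the line format described in the report:
;; SYLLABLE <TAB> CHARACTERS.  Not copied from the real pinyin.map.
(defconst my/pinyin-sample "a\t阿啊\nai\t爱哀\n")

(defun my/parse-pinyin-map (text)
  "Parse TEXT in pinyin.map line format into an alist.
Each element has the form (SYLLABLE . CHARACTERS)."
  (let (alist)
    (with-temp-buffer
      (insert text)
      ;; `insert' leaves point at the end, so search from the top.
      (goto-char (point-min))
      (while (re-search-forward
              "^\\([a-z]+\\)\t\\([^[:ascii:]]+\\)$" nil t)
        (push (cons (match-string-no-properties 1)
                    (match-string-no-properties 2))
              alist)))
    ;; Restore file order, as the patch does with `nreverse'.
    (nreverse alist)))

(defun my/pinyin-for-char (char alist)
  "Return the syllables in ALIST whose character string contains CHAR."
  (let (result)
    (dolist (entry alist (nreverse result))
      (when (string-match-p (regexp-quote (string char)) (cdr entry))
        (push (car entry) result)))))
```

With the sample above, (cdr (assoc "ai" (my/parse-pinyin-map my/pinyin-sample))) looks up the characters for one syllable, and `my/pinyin-for-char' goes the other way, which is the kind of search-by-ASCII use the report mentions.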