From unknown Sat Aug 16 16:23:28 2025
Subject: bug#1399: 23.0.60; Some Unicode glyphs incorrectly mapped to CJK
From: Ian Eure
To: emacs-pretest-bug@gnu.org
Date: Thu, 20 Nov 2008 15:33:47 -0800
Message-Id: <8A53EDCC-4050-4189-9162-D8186BF0E0B8@digg.com>

It seems that some Unicode glyphs are incorrectly categorized.

For example, U+201C, U+201D, U+2018, U+2019 (LEFT/RIGHT SINGLE/DOUBLE
QUOTATION MARK) are all mapped into the CJK category. This results in
the use of the STHeiti font for those characters, which are a
different width than the normal font I've chosen.

I think it's incorrect for them to be categorized as CJK, since they
are widely used in Latin scripts.

In GNU Emacs 23.0.60.1 (i386-apple-darwin9.5.0, NS apple-appkit-949.35)
 of 2008-11-20 on neutron.local
Windowing system distributor `Apple', version 97.112.112.108.101.45.97.112.112.107.105.116.45.57.52.57.46.51.53
configured using `configure '--with-ns''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: nil
  value of $XMODIFIERS: nil
  locale-coding-system: nil
  default-enable-multibyte-characters: t

Major mode: Help

Minor modes in effect:
  ime-bindings: t
  tooltip-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  view-mode: t

Recent input:
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-x b * f o o C-x k RET C-x b f o C-a C-k C-p C-f C-f C-k
C-a C-f C-a M-x d e s c r i b e - c h a r w C-_ C-x o C-n
C-e C-b C-b C-b C-b C-c C-b C-b C-p C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
M-> C-p C-p C-p C-p C-p C-n C-n C-e C-a
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-n C-n C-n C-n C-n
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
M-p M-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-a
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-x b f o n t C-x 1 C-v C-v C-v M-v C-v C-v C-v M-v M-v M-v
C-x b C-x b C-x b * H
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-n C-a
C-SPC C-e M-w

Recent messages:
uncompressing fontset.el.gz...done
call-interactively: Beginning of buffer
Type C-x 1 to delete the help window.
Undo!
Mark set
Auto-saving...done
byte-code: Beginning of buffer [3 times]
byte-code: End of buffer [6 times]
byte-code: Beginning of buffer [7 times]
Mark set

From unknown Sat Aug 16 16:23:28 2025
Subject: bug#1399: 23.0.60; Some Unicode glyphs incorrectly mapped to CJK
From: Chong Yidong
To: Kenichi Handa
Cc: 1399@debbugs.gnu.org
Date: Sat, 29 Nov 2008 21:26:05 -0500
Message-ID: <87oczyc5ky.fsf@cyd.mit.edu>

Hi Handa-san,

Could you take a look at this bug report?

Ian Eure wrote:

> It seems that some Unicode glyphs are incorrectly categorized.
>
> For example, U+201C, U+201D, U+2018, U+2019 (LEFT/RIGHT SINGLE/DOUBLE
> QUOTATION MARK) are all mapped into the CJK category. This results in
> the use of the STHeiti font for those characters, which are a
> different width than the normal font I've chosen.
>
> I think it's incorrect for them to be categorized as CJK, since they
> are widely used in Latin scripts.

From unknown Sat Aug 16 16:23:28 2025
Subject: bug#1399: 23.0.60; Some Unicode glyphs incorrectly mapped to CJK
From: Kenichi Handa
To: 1399@debbugs.gnu.org
CC: ian@digg.com
Date: Tue, 17 Mar 2009 11:00:30 +0900

Sorry for the late response.

> It seems that some Unicode glyphs are incorrectly categorized.
>
> For example, U+201C, U+201D, U+2018, U+2019 (LEFT/RIGHT SINGLE/DOUBLE
> QUOTATION MARK) are all mapped into the CJK category. This results in
> the use of the STHeiti font for those characters, which are a
> different width than the normal font I've chosen.

Category doesn't affect font selection. As all of those characters
belong to the `symbol' script, Emacs first lists the fonts that have
at least one of #x201C, #x2200, #x2500 (see
script-representative-chars), and selects the one that best matches
your default font's family, foundry, etc. In your case, perhaps all
the listed fonts have a different family, foundry, etc. than the
default font, and thus Emacs selects an arbitrary one from the listed
fonts.

Currently, Emacs can't know which kind of font is more suitable for
those characters: a font that has double-width glyphs for them, or a
font that has single-width glyphs.

So, if you prefer a specific font for symbol characters, you must
modify the default fontset (or whatever fontset you are using) for
symbol characters, for example, like this:

(set-fontset-font "fontset-default" 'symbol
                  '("FAMILYNAME" . "iso10646-1"))

> I think it's incorrect for them to be categorized as CJK, since they
> are widely used in latin scripts.

Character category is not exclusive. Even if a character has the CJK
category, it doesn't mean that the character is not Latin.

---
Kenichi Handa
handa@m17n.org

From cyd@stupidchicken.com Tue Mar 17 07:05:19 2009
From: Chong Yidong
To: control@debbugs.gnu.org
Subject: tag 1399
Date: Tue, 17 Mar 2009 10:06:36 -0400
Message-ID: <87ab7kp7kj.fsf@cyd.mit.edu>

tags 1399 wontfix, notabug
thanks

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 02 12:05:27 2011
From: Lars Magne Ingebrigtsen
To: control@debbugs.gnu.org
Subject: control message for bug #1399
Date: Tue, 02 Aug 2011 18:04:27 +0200

close 1399