From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 18 08:51:08 2020
From: ynyaaa@gmail.com
To: bug-gnu-emacs@gnu.org
Subject: 27.0.60; inappropriate han script definition in char-script-table
Date: Tue, 18 Feb 2020 22:50:57 +0900
Message-ID: <86mu9g6m2m.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain

'han' script is defined in char-script-table as:
  2E80-2FDF han
  3200-9FFF han
  F900-FAFF han
  FE30-FE4F han
  1F200-1F2FF han
  20000-2A6DF han
  2A700-2EBEF han
  2F800-2FA1F han

It is better to set values as:
  3200-33FF cjk-misc
  4DC0-4DFF cjk-misc
  FE30-FE4F cjk-misc
  1F200-1F2FF cjk-misc

If enclosed CJK Ideographs should be 'han' script,
enclosed Hanguls should be 'hangul' script,
enclosed Katakana should be 'kana' script,
and enclosed Numbers should be 'symbol' script.
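A minimal sketch, not part of the report, of how the proposed reassignment could be tried out. It assumes only the built-in `char-script-table`, `copy-sequence`, `set-char-table-range` and `aref`; the `my-` names are hypothetical. It works on a copy of the table, so the running session is not changed, and it says nothing about whether fontset.el would need matching changes.

```elisp
;; Ranges the report proposes to label `cjk-misc'.
;; The variable name is hypothetical, chosen for this sketch only.
(defvar my-proposed-cjk-misc-ranges
  '((#x3200 . #x33FF)
    (#x4DC0 . #x4DFF)
    (#xFE30 . #xFE4F)
    (#x1F200 . #x1F2FF))
  "Character ranges proposed for the `cjk-misc' script.")

(defun my-script-table-with-cjk-misc ()
  "Return a copy of `char-script-table' with the proposed ranges set.
The original table is left untouched."
  (let ((table (copy-sequence char-script-table)))
    (dolist (range my-proposed-cjk-misc-ranges)
      (set-char-table-range table range 'cjk-misc))
    table))

;; Compare the current and the proposed script for U+3200.
(list (aref char-script-table #x3200)
      (aref (my-script-table-with-cjk-misc) #x3200))
```

Evaluating the last form in `*scratch*` shows the session's current value next to the proposed one; the current value depends on the Emacs version in use.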

In GNU Emacs 27.0.60 (build 1, x86_64-w64-mingw32)
 of 2019-12-29 built on CIRROCUMULUS
Repository revision: 21c3020fcec0a32122d2680a391864a75393031b
Repository branch: emacs-27
Windowing system distributor 'Microsoft Corp.', version 10.0.18363
System Description: Microsoft Windows 10 Pro (v10.0.1909.18363.657)

Recent messages:

Configured using:
 'configure --without-dbus --host=x86_64-w64-mingw32
 --without-compress-install -C 'CFLAGS=-O2 -static -g3''

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND NOTIFY W32NOTIFY ACL GNUTLS LIBXML2
HARFBUZZ ZLIB TOOLKIT_SCROLL_BARS MODULES THREADS PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: JPN
  locale-coding-system: cp932

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(rect wid-edit descr-text mule-diag thingatpt cl-extra novice help-fns
radix-tree cl-print debug backtrace find-func gnutls network-stream nsm
mailalias smtpmail auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs json map misearch multi-isearch help-mode pp shadow sort
mail-extr term/bobcat emacsbug message rmc puny dired dired-loaddefs
format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg
epg-config gnus-util rmail rmail-loaddefs text-property-search time-date
subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies
mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs
cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
japan-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel dos-w32 ls-lisp disp-table term/w32-win w32-win
w32-vars term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads w32notify w32 lcms2
multi-tty make-network-process emacs)

Memory information:
((conses 16 921325 302508)
 (symbols 48 58666 0)
 (strings 32 119118 12074)
 (string-bytes 1 2586170)
 (vectors 16 88868)
 (vector-slots 8 2545555 209000)
 (floats 8 47 281)
 (intervals 56 44668 5857)
 (buffers 1000 22))

From debbugs-submit-bounces@debbugs.gnu.org Tue Feb 18 11:02:50 2020
Date: Tue, 18 Feb 2020 18:02:37 +0200
Message-Id: <83y2szlw82.fsf@gnu.org>
From: Eli Zaretskii
To: ynyaaa@gmail.com, Kenichi Handa
Cc: 39659@debbugs.gnu.org
In-reply-to: <86mu9g6m2m.fsf@gmail.com> (ynyaaa@gmail.com)
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
References: <86mu9g6m2m.fsf@gmail.com>

> From: ynyaaa@gmail.com
> Date: Tue, 18 Feb 2020 22:50:57 +0900
>
> 'han' script is defined in char-script-table as:
>   2E80-2FDF han
>   3200-9FFF han
>   F900-FAFF han
>   FE30-FE4F han
>   1F200-1F2FF han
>   20000-2A6DF han
>   2A700-2EBEF han
>   2F800-2FA1F han
>
> It is better to set values as:
>   3200-33FF cjk-misc
>   4DC0-4DFF cjk-misc
>   FE30-FE4F cjk-misc
>   1F200-1F2FF cjk-misc
>
> If enclosed CJK Ideographs should be 'han' script,
> enclosed Hanguls should be 'hangul' script,
> enclosed Katakana should be 'kana' script,
> and enclosed Numbers should be 'symbol' script.

Please provide some rationale for the differences, just saying
"better" and "should" doesn't explain why you think the changes are
for the good.

CC'ing Handa-san, who I hope will have some comments on this.

Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 19 04:53:25 2020
From: ynyaaa@gmail.com
To: Eli Zaretskii
Cc: Kenichi Handa, 39659@debbugs.gnu.org
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
In-Reply-To: <83y2szlw82.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 18 Feb 2020 18:02:37 +0200")
Date: Wed, 19 Feb 2020 18:53:07 +0900
Message-ID: <86y2syj43g.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain

Eli Zaretskii writes:

>> From: ynyaaa@gmail.com
>> Date: Tue, 18 Feb 2020 22:50:57 +0900
>>
>> 'han' script is defined in char-script-table as:
>>   2E80-2FDF han
>>   3200-9FFF han
>>   F900-FAFF han
>>   FE30-FE4F han
>>   1F200-1F2FF han
>>   20000-2A6DF han
>>   2A700-2EBEF han
>>   2F800-2FA1F han
>>
>> It is better to set values as:
>>   3200-33FF cjk-misc
>>   4DC0-4DFF cjk-misc
>>   FE30-FE4F cjk-misc
>>   1F200-1F2FF cjk-misc
>>
>> If enclosed CJK Ideographs should be 'han' script,
>> enclosed Hanguls should be 'hangul' script,
>> enclosed Katakana should be 'kana' script,
>> and enclosed Numbers should be 'symbol' script.
>
> Please provide some rationale for the differences, just saying
> "better" and "should" doesn't explain why you think the changes are
> for the good.
>
> CC'ing Handa-san, who I hope will have some comments on this.
>
> Thanks.

Because they are not han characters.
I think that combinatorial characters are not han characters,
and that they are symbolic characters.

As for enclosed latin letters, they are treated as 'symbol' script.
  249C-24B5     PARENTHESIZED LATIN SMALL LETTER *
  24B6-24CF     CIRCLED LATIN CAPITAL LETTER *
  24D0-24E9     CIRCLED LATIN SMALL LETTER *
  1F110-1F129   PARENTHESIZED LATIN CAPITAL LETTER *
  1F130-1F149   SQUARED LATIN CAPITAL LETTER *
  1F150-1F169   NEGATIVE CIRCLED LATIN CAPITAL LETTER *
  1F170-1F189   NEGATIVE SQUARED LATIN CAPITAL LETTER *
  1F12A         TORTOISE SHELL BRACKETED LATIN CAPITAL LETTER S
  1F12B         CIRCLED ITALIC LATIN CAPITAL LETTER C
  1F12C         CIRCLED ITALIC LATIN CAPITAL LETTER R
  1F18A         CROSSED NEGATIVE SQUARED LATIN CAPITAL LETTER P
  1F1A5         SQUARED LATIN SMALL LETTER D

If script is set to han, hangul or kana for combinatorial characters
which contain han, hangul or kana characters,
script values are like below:

  CodePoint     Script  Comment
  3200-321E     hangul  enclosed hangul
  321F          -       unassigned
  3220-3247     han     enclosed han
  3248-324F     symbol  enclosed number
  3250          symbol  combined latin
  3251-325F     symbol  enclosed number
  3260-327E     hangul  enclosed hangul
  327F          symbol  symbol
  3280-32B0     han     enclosed han
  32B1-32BF     symbol  enclosed number
  32C0-32CB     han     square character with han
  32CC-32CF     symbol  square character with latin
  32D0-32FE     kana    enclosed kana
  32FF          han     square character with han
  3300-3357     kana    square character with kana
  3358-3370     han     square character with han
  3371-337A     symbol  square character with latin
  337B-337F     han     square character with han
  3380-33DF     symbol  square character with latin
  33E0-33FE     han     square character with han
  33FF          symbol  square character with latin
  4DC0-4DFF     symbol  symbol
  FE30-FE44     symbol  symbol for vertical
  FE45-FE46     symbol  symbol
  FE47-FE48     symbol  symbol for vertical
  FE49-FE4F     symbol  symbol
  1F200-1F202   kana    enclosed/square character with kana
  ...           -       unassigned
  1F210-1F212   han     enclosed han
  1F213         kana    enclosed kana
  1F214-1F248   han     enclosed han
  ...           -       unassigned
  1F250-1F251   han     enclosed han
  ...           -       unassigned
  1F260-1F265   symbol  symbol

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 19 10:43:26 2020
Date: Wed, 19 Feb 2020 17:43:02 +0200
Message-Id: <83eeuqlh15.fsf@gnu.org>
From: Eli Zaretskii
To: ynyaaa@gmail.com
Cc: handa@gnu.org, 39659@debbugs.gnu.org
In-reply-to: <86y2syj43g.fsf@gmail.com> (ynyaaa@gmail.com)
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
References: <86y2syj43g.fsf@gmail.com>

> From: ynyaaa@gmail.com
> Cc: Kenichi Handa, 39659@debbugs.gnu.org
> Date: Wed, 19 Feb 2020 18:53:07 +0900
>
> >> It is better to set values as:
> >>   3200-33FF cjk-misc
> >>   4DC0-4DFF cjk-misc
> >>   FE30-FE4F cjk-misc
> >>   1F200-1F2FF cjk-misc
> >>
> >> If enclosed CJK Ideographs should be 'han' script,
> >> enclosed Hanguls should be 'hangul' script,
> >> enclosed Katakana should be 'kana' script,
> >> and enclosed Numbers should be 'symbol' script.
> >
> > Please provide some rationale for the differences, just saying
> > "better" and "should" doesn't explain why you think the changes are
> > for the good.
> >
> > CC'ing Handa-san, who I hope will have some comments on this.
> >
> > Thanks.
>
> Because they are not han characters.
> I think that combinatorial characters are not han characters,
> and that they are symbolic characters.

So your interpretation of cjk-misc is that they are symbols, not
letters?  I'm asking because I don't really know what is meant by
"cjk-misc", I don't think we have it documented anywhere.

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 20 01:28:04 2020
From: ynyaaa@gmail.com
To: Eli Zaretskii
Cc: handa@gnu.org, 39659@debbugs.gnu.org
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
In-Reply-To: <83eeuqlh15.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 19 Feb 2020 17:43:02 +0200")
Date: Thu, 20 Feb 2020 15:27:47 +0900
Message-ID: <86k14hu61o.fsf@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain

Eli Zaretskii writes:

>> From: ynyaaa@gmail.com
>> Cc: Kenichi Handa, 39659@debbugs.gnu.org
>> Date: Wed, 19 Feb 2020 18:53:07 +0900
>>
>> >> It is better to set values as:
>> >>   3200-33FF cjk-misc
>> >>   4DC0-4DFF cjk-misc
>> >>   FE30-FE4F cjk-misc
>> >>   1F200-1F2FF cjk-misc
>> >>
>> >> If enclosed CJK Ideographs should be 'han' script,
>> >> enclosed Hanguls should be 'hangul' script,
>> >> enclosed Katakana should be 'kana' script,
>> >> and enclosed Numbers should be 'symbol' script.
>> >
>> > Please provide some rationale for the differences, just saying
>> > "better" and "should" doesn't explain why you think the changes are
>> > for the good.
>> >
>> > CC'ing Handa-san, who I hope will have some comments on this.
>> >
>> > Thanks.
>>
>> Because they are not han characters.
>> I think that combinatorial characters are not han characters,
>> and that they are symbolic characters.
>
> So your interpretation of cjk-misc is that they are symbols, not
> letters?  I'm asking because I don't really know what is meant by
> "cjk-misc", I don't think we have it documented anywhere.

I guess the cjk-misc script means CJK-related characters.

Block names in the Unicode Character Database are listed as below.
(https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt)
  3000..303F; CJK Symbols and Punctuation
  31C0..31EF; CJK Strokes
  3200..32FF; Enclosed CJK Letters and Months
  3300..33FF; CJK Compatibility
  4DC0..4DFF; Yijing Hexagram Symbols
  FE30..FE4F; CJK Compatibility Forms
  1F200..1F2FF; Enclosed Ideographic Supplement

Yijing Hexagram Symbols (U+4DC0..U+4DFF) are Chinese symbols related to
  2630-2637     TRIGRAM FOR *
  268A-268B     MONOGRAM FOR *
  268C-268F     DIGRAM FOR *
  1D300-1D35F   Tai Xuan Jing Symbols

The script symbol for "Yijing Hexagram Symbols" may be
'symbol or 'yijing-hexagram-symbol.
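
A small sketch, added for illustration, of how the script values that `char-script-table` currently assigns to the blocks listed above can be inspected. The block list is copied from the message; the `my-` names are hypothetical and only `aref` on the built-in table is assumed.

```elisp
;; Blocks named in the message above, as (NAME FROM . TO).
;; The variable name is hypothetical.
(defvar my-cjk-related-blocks
  '(("CJK Symbols and Punctuation"     #x3000 . #x303F)
    ("CJK Strokes"                     #x31C0 . #x31EF)
    ("Enclosed CJK Letters and Months" #x3200 . #x32FF)
    ("CJK Compatibility"               #x3300 . #x33FF)
    ("Yijing Hexagram Symbols"         #x4DC0 . #x4DFF)
    ("CJK Compatibility Forms"         #xFE30 . #xFE4F)
    ("Enclosed Ideographic Supplement" #x1F200 . #x1F2FF))
  "Unicode blocks discussed in the thread.")

(defun my-scripts-in-range (from to)
  "Return the distinct values `char-script-table' holds for FROM..TO.
The result is in order of first appearance; nil means no script is set."
  (let ((scripts '())
        (ch from))
    (while (<= ch to)
      (let ((script (aref char-script-table ch)))
        (unless (memq script scripts)
          (push script scripts)))
      (setq ch (1+ ch)))
    (nreverse scripts)))

;; One entry per block: (NAME SCRIPT...).
(mapcar (lambda (block)
          (cons (car block)
                (my-scripts-in-range (cadr block) (cddr block))))
        my-cjk-related-blocks)
```

The result differs between Emacs versions, so no expected output is given here.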

From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 28 22:39:48 2020
From: handa
To: ynyaaa@gmail.com
Cc: eliz@gnu.org, 39659@debbugs.gnu.org
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
In-Reply-To: <86y2syj43g.fsf@gmail.com> (ynyaaa@gmail.com)
Date: Sat, 29 Feb 2020 12:39:30 +0900
Message-ID: <877e06uknh.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain

In article <86y2syj43g.fsf@gmail.com>, ynyaaa@gmail.com writes:

>>> From: ynyaaa@gmail.com
>>> Date: Tue, 18 Feb 2020 22:50:57 +0900
>>>
>>> 'han' script is defined in char-script-table as:
>>>   2E80-2FDF han
>>>   3200-9FFF han
>>>   F900-FAFF han
>>>   FE30-FE4F han
>>>   1F200-1F2FF han
>>>   20000-2A6DF han
>>>   2A700-2EBEF han
>>>   2F800-2FA1F han
>>>
>>> It is better to set values as:
>>>   3200-33FF cjk-misc
>>>   4DC0-4DFF cjk-misc
>>>   FE30-FE4F cjk-misc
>>>   1F200-1F2FF cjk-misc

The script names were at first assigned to help fontset.el, which sets up
the default fontset by using script names in defining font specs (for
CHARSET_REGISTRY of X fonts or "script" of OpenType fonts).  So there
were no precise semantics.

I think it is ok to change/fix char-script-table to improve some
behavior of Emacs without breaking fontset.el.

---
K. Handa
handa@gnu.org

From debbugs-submit-bounces@debbugs.gnu.org Sat Feb 29 02:35:02 2020
Date: Sat, 29 Feb 2020 09:34:39 +0200
Message-Id: <83sgitetio.fsf@gnu.org>
From: Eli Zaretskii
To: handa
Cc: ynyaaa@gmail.com, 39659@debbugs.gnu.org
In-reply-to: <877e06uknh.fsf@gnu.org> (message from handa on Sat, 29 Feb 2020 12:39:30 +0900)
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
References: <877e06uknh.fsf@gnu.org>

> From: handa
> Cc: eliz@gnu.org, 39659@debbugs.gnu.org
> Date: Sat, 29 Feb 2020 12:39:30 +0900
>
> In article <86y2syj43g.fsf@gmail.com>, ynyaaa@gmail.com writes:
>
> >>> From: ynyaaa@gmail.com
> >>> Date: Tue, 18 Feb 2020 22:50:57 +0900
> >>>
> >>> 'han' script is defined in char-script-table as:
> >>>   2E80-2FDF han
> >>>   3200-9FFF han
> >>>   F900-FAFF han
> >>>   FE30-FE4F han
> >>>   1F200-1F2FF han
> >>>   20000-2A6DF han
> >>>   2A700-2EBEF han
> >>>   2F800-2FA1F han
> >>>
> >>> It is better to set values as:
> >>>   3200-33FF cjk-misc
> >>>   4DC0-4DFF cjk-misc
> >>>   FE30-FE4F cjk-misc
> >>>   1F200-1F2FF cjk-misc
>
> The script names were at first assigned to help fontset.el, which sets up
> the default fontset by using script names in defining font specs (for
> CHARSET_REGISTRY of X fonts or "script" of OpenType fonts).  So there
> were no precise semantics.

OK, but would you agree that the latter group of character blocks,
i.e.

  3200-33FF
  4DC0-4DFF
  FE30-FE4F
  1F200-1F2FF

should be in the cjk-misc category?  Or, to phrase this differently:
why was cjk-misc created in the first place, since the only difference
between it and han in the default fontset seems to be this single
element:

  (nil . "JISX0213.2004-1")

which is present for the han script, but absent for cjk-misc.  I don't
think I see where the CHARSET_REGISTRY of X or "script" of OpenType
fonts come into play, where distinguishing between han and cjk-misc is
concerned.

> I think it is ok to change/fix char-script-table to improve some
> behavior of Emacs without breaking fontset.el.

Can you elaborate about this?  I don't think I understand which fixes
you had in mind, and how they could or could not break fontset.el.

Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 07 20:13:53 2020
From: handa
To: Eli Zaretskii
Cc: handa@gnu.org, ynyaaa@gmail.com, 39659@debbugs.gnu.org
Subject: Re: bug#39659: 27.0.60; inappropriate han script definition in char-script-table
In-Reply-To: <83sgitetio.fsf@gnu.org> (message from Eli Zaretskii on Sat, 29 Feb 2020 09:34:39 +0200)
Date: Sun, 08 Mar 2020 10:13:41 +0900
Message-ID: <87a74rk5ru.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain

In article <83sgitetio.fsf@gnu.org>, Eli Zaretskii writes:

> > The script names were at first assigned to help fontset.el, which sets up
> > the default fontset by using script names in defining font specs (for
> > CHARSET_REGISTRY of X fonts or "script" of OpenType fonts).  So there
> > were no precise semantics.

> OK, but would you agree that the latter group of character blocks,
> i.e.

>   3200-33FF
>   4DC0-4DFF
>   FE30-FE4F
>   1F200-1F2FF

> should be in the cjk-misc category?  Or, to phrase this differently:
> why was cjk-misc created in the first place,

When I defined them, it was a transition period for the font-related
environment.  As far as I remember, cjk-misc was introduced later for
fonts that cover characters used in CJK environments but not yet
covered by legacy CJK X fonts (JISX0208, JISX0212, GB2312, KSC5601).

> since the only difference between it and han in the default fontset
> seems to be this single element:

>   (nil . "JISX0213.2004-1")

> which is present for the han script, but absent for cjk-misc.

The definition of the default fontset was changed frequently as the
font-related environment changed.  Perhaps the current setting should
be reconsidered based on the current font-related environment.

> > I think it is ok to change/fix char-script-table to improve some
> > behavior of Emacs without breaking fontset.el.

> Can you elaborate about this?  I don't think I understand which fixes
> you had in mind, and how they could or could not break fontset.el.

As I am not familiar with recent trends in the font-related
environment, I cannot suggest how to fix the current setting.  All I
can say is that, when someone changes char-script-table, they should
also check how the script is used in the definition of the default
fontset.

---
K. Handa
handa@gnu.org
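
One possible way to do the check suggested above, as a rough sketch rather than an established procedure: look up a character's script and the default fontset entries that apply to it. It assumes `fontset-font` accepts `t` for the default fontset and a non-nil third argument to return all entries, which is how its docstring describes it; the function is only defined in builds with window-system support, hence the guard. The `my-` name is hypothetical.

```elisp
(defun my-script-and-fontset-entries (ch)
  "Return (SCRIPT . ENTRIES) for character CH.
SCRIPT is the value of `char-script-table' for CH.  ENTRIES is what
the default fontset lists for CH, or nil when `fontset-font' is not
available in this build."
  (cons (aref char-script-table ch)
        (when (fboundp 'fontset-font)
          ;; t selects the default fontset; the last t asks for all entries.
          (fontset-font t ch t))))

;; Inspect one character from each range discussed in the thread.
(mapcar #'my-script-and-fontset-entries
        '(#x3200 #x4DC0 #xFE30 #x1F200))
```

Comparing the entries before and after a change to `char-script-table` would need the default fontset to be rebuilt (for example in a fresh session), since fontset.el reads the script names when it sets the fontset up.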