From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Fri, 28 Mar 2014 08:07:20 -0400
Message-ID: <87txair0g7.fsf@ivytech.edu>

M-: (compare-strings "σ" nil nil "ς" nil nil t)

==> -1 ;; should be t

Can someone that knows a thing about Unicode and emacs case tables speak
to whether the latter could suffice for implementing full Unicode case
folding?

From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Fri, 28 Mar 2014 18:51:49 +0300
Message-ID: <83fvm2fhii.fsf@gnu.org>
In-Reply-To: <87txair0g7.fsf@ivytech.edu>

> From: Nathan Trapuzzano
> Date: Fri, 28 Mar 2014 08:07:20 -0400
>
> M-: (compare-strings "σ" nil nil "ς" nil nil t)
>
> ==> -1 ;; should be t

No, because these characters are not a case pair.

> Can someone that knows a thing about Unicode and emacs case tables speak
> to whether the latter could suffice for implementing full Unicode case
> folding?

What is "full Unicode case folding"?

From unknown Mon Jun 23 20:19:33 2025
From: nbtrap@nbtrap.com
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Fri, 28 Mar 2014 15:31:09 -0400
Message-ID: <87ob0qrugy.fsf@nbtrap.com>
In-Reply-To: <83fvm2fhii.fsf@gnu.org>

Eli Zaretskii writes:

>> M-: (compare-strings "σ" nil nil "ς" nil nil t)
>>
>> ==> -1 ;; should be t
>
> No, because these characters are not a case pair.

They're not a case pair in Emacs, but they should compare equally under
Unicode case folding.

>> Can someone that knows a thing about Unicode and emacs case tables speak
>> to whether the latter could suffice for implementing full Unicode case
>> folding?
>
> What is "full Unicode case folding"?

Something that implements this:
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt

And perhaps more. I don't know, but someone on this list probably does.

If you look about a third of the way down, there's a line saying that
U+03C2 (ς) should fold into U+03C3 (σ).

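For reference outside Emacs, the folding rule cited above can be checked against an implementation that follows CaseFolding.txt. The sketch below uses Python's built-in `str.casefold()`, which applies Unicode full case folding. It only illustrates the expected comparison result and says nothing about how Emacs case tables work; the helper name `fold_equal` is invented for this illustration.

```python
def fold_equal(a: str, b: str) -> bool:
    """Compare two strings under Unicode full case folding."""
    return a.casefold() == b.casefold()


CAPITAL_SIGMA = "\u03a3"  # Σ
FINAL_SIGMA = "\u03c2"    # ς
SMALL_SIGMA = "\u03c3"    # σ

# ς folds to σ, so σ and ς compare equal after folding.
sigma_vs_final = fold_equal(SMALL_SIGMA, FINAL_SIGMA)

# Σ folds to σ as well, so it also matches ς.
capital_vs_final = fold_equal(CAPITAL_SIGMA, FINAL_SIGMA)

# Plain lowercasing is not enough: ς and σ are both already lowercase
# and remain distinct characters.
lower_only = SMALL_SIGMA.lower() == FINAL_SIGMA.lower()

print(sigma_vs_final, capital_vs_final, lower_only)
```

The contrast between folding and lowercasing in the last comparison is the gap the report describes: a table of upper/lower pairs has no entry that relates ς to σ.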
From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: nbtrap@nbtrap.com
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 09:45:10 +0300
Message-ID: <83y4ztec5l.fsf@gnu.org>
In-Reply-To: <87ob0qrugy.fsf@nbtrap.com>

> From: nbtrap@nbtrap.com
> Cc: 17130@debbugs.gnu.org
> Date: Fri, 28 Mar 2014 15:31:09 -0400
>
> Eli Zaretskii writes:
>
> >> M-: (compare-strings "σ" nil nil "ς" nil nil t)
> >>
> >> ==> -1 ;; should be t
> >
> > No, because these characters are not a case pair.
>
> They're not a case pair in Emacs, but they should compare equally under
> Unicode case folding.

Emacs doesn't currently support that.

> >> Can someone that knows a thing about Unicode and emacs case tables speak
> >> to whether the latter could suffice for implementing full Unicode case
> >> folding?
> >
> > What is "full Unicode case folding"?
>
> Something that implements this:
> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
>
> And perhaps more. I don't know, but someone on this list probably does.
>
> If you look about a third of the way down, there's a line saying that
> U+03C2 (ς) should fold into U+03C3 (σ).

Patches are welcome to import those tables into Emacs, and make case
folding support them.

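Importing the table starts with reading its line format, `<code>; <status>; <mapping>; # <name>`, where the status is C (common), F (full), S (simple) or T (Turkic special case). The following is a rough sketch of such a reader, written in Python rather than Emacs Lisp and fed a three-line excerpt instead of the downloaded file. `SAMPLE` and `parse_case_folding` are names made up here, and nothing below reflects how an actual Emacs patch would be structured.

```python
SAMPLE = """\
# excerpt in the format of CaseFolding.txt
0041; C; 0061; # LATIN CAPITAL LETTER A
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
"""


def parse_case_folding(text, statuses=("C", "F")):
    """Return {character: folded string} for the requested status classes."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue  # blank or comment-only line
        code, status, mapping = [f.strip() for f in line.split(";")[:3]]
        if status in statuses:
            # A mapping may hold several code points (status F).
            folded = "".join(chr(int(cp, 16)) for cp in mapping.split())
            table[chr(int(code, 16))] = folded
    return table


folds = parse_case_folding(SAMPLE)
print(folds["\u03c2"] == "\u03c3")
print(folds["\u00df"])
```

Selecting statuses C and F gives full case folding; selecting C and S instead would give simple folding, where every mapping is a single character and so fits a one-character-per-slot table.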
From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 08:37:35 -0400
Message-ID: <87ob0pnptc.fsf@nbtrap.com>
In-Reply-To: <83y4ztec5l.fsf@gnu.org>

Eli Zaretskii writes:

>> > What is "full Unicode case folding"?
>>
>> Something that implements this:
>> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
>>
>> And perhaps more. I don't know, but someone on this list probably does.
>>
>> If you look about a third of the way down, there's a line saying that
>> U+03C2 (ς) should fold into U+03C3 (σ).
>
> Patches are welcome to import those tables into Emacs, and make case
> folding support them.

Reading through the manual section on case tables, it seems that this
could be supported via the extra "canonicalize" slot:

  CANONICALIZE
       The canonicalize table maps all of a set of case-related
       characters into a particular member of that set.

If this isn't already used for Unicode case folding, what _is_ it used
for?

From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 16:15:53 +0300
Message-ID: <83d2h5du2e.fsf@gnu.org>
In-Reply-To: <87ob0pnptc.fsf@nbtrap.com>

> From: Nathan Trapuzzano
> Cc: 17130@debbugs.gnu.org
> Date: Sat, 29 Mar 2014 08:37:35 -0400
>
> Reading through the manual section on case tables, it seems that this
> could be supported via the extra "canonicalize" slot:
>
>   CANONICALIZE
>        The canonicalize table maps all of a set of case-related
>        characters into a particular member of that set.

Not efficiently, no. E.g., how will you find ς from σ, using this
method?

Besides, don't we also need to know that ς can only be present at the
end of a word? Or maybe I'm misunderstanding what you meant?

> If this isn't already used for Unicode case folding, what _is_ it used
> for?

It is used for case-insensitive regexp matching, see search.c.

From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 10:03:32 -0400
Message-ID: <87eh1lcdaj.fsf@nbtrap.com>
In-Reply-To: <83d2h5du2e.fsf@gnu.org>

Eli Zaretskii writes:

>> Reading through the manual section on case tables, it seems that this
>> could be supported via the extra "canonicalize" slot:
>>
>>   CANONICALIZE
>>        The canonicalize table maps all of a set of case-related
>>        characters into a particular member of that set.
>
> Not efficiently, no. E.g., how will you find ς from σ, using this
> method?

σ, ς, and Σ would all have σ in the CANONICALIZE slot, since they all
fold to σ. (By the way, ς should upcase to Σ--that much I know the case
tables can handle.)

> Besides, don't we also need to know that ς can only be present at the
> end of a word?

Don't think so. AFAIK, Unicode says nothing about ordering except when
it comes to combining characters. But even if it did prescribe such a
rule, I don't think it would have anything to do with case folding.

>> If this isn't already used for Unicode case folding, what _is_ it used
>> for?
>
> It is used for case-insensitive regexp matching, see search.c.

Right, but what I'm asking is: if Emacs doesn't do Unicode case folding,
what is the purpose of the CANONICALIZE slot except as a kind of
placeholder that gets autofilled? Are there other kinds of case
folding--other than traditional upper/lower and Unicode--that I'm not
aware of? I understand that Emacs autofills the CANONICALIZE slot from
the other slots, but only when the CANONICALIZE slot is not already set
to non-nil. What if the CANONICALIZE slot on ς were set to σ? I think
that's all that would have to happen for the Unicode folding to work.
It seems the machinery is already in place.

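The proposal in this message amounts to a many-to-one table: every member of a case-equivalence set maps to one representative, and two characters match when their representatives are equal. Below is a toy model of that idea in Python. A plain dict stands in for the char-table, so this is not Emacs's implementation, and `CANONICALIZE`, `canon`, `chars_match` and `strings_match` are illustrative names only.

```python
# Many-to-one table: each character maps to the representative of its set.
CANONICALIZE = {
    "\u03a3": "\u03c3",  # Σ -> σ
    "\u03c3": "\u03c3",  # σ -> σ
    "\u03c2": "\u03c3",  # ς -> σ (the extra entry being proposed)
}


def canon(ch):
    """Return the representative of ch, or ch itself if it has no entry."""
    return CANONICALIZE.get(ch, ch)


def chars_match(a, b):
    """Two characters match when they share a representative."""
    return canon(a) == canon(b)


def strings_match(s, t):
    """Character-by-character comparison through the table."""
    return len(s) == len(t) and all(chars_match(a, b) for a, b in zip(s, t))


print(chars_match("\u03c3", "\u03c2"))
print(strings_match("\u03c3\u03c2", "\u03a3\u03a3"))
```

Comparison only ever looks up the two characters in hand, which is why this direction is cheap. Going the other way, from σ to every character that folds to it, is the operation this table does not answer directly.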
From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 17:45:47 +0300
Message-ID: <838urtdpwk.fsf@gnu.org>
In-Reply-To: <87eh1lcdaj.fsf@nbtrap.com>

> From: Nathan Trapuzzano
> Cc: 17130@debbugs.gnu.org
> Date: Sat, 29 Mar 2014 10:03:32 -0400
>
> Eli Zaretskii writes:
>
> >> Reading through the manual section on case tables, it seems that this
> >> could be supported via the extra "canonicalize" slot:
> >>
> >>   CANONICALIZE
> >>        The canonicalize table maps all of a set of case-related
> >>        characters into a particular member of that set.
> >
> > Not efficiently, no. E.g., how will you find ς from σ, using this
> > method?
>
> σ, ς, and Σ would all have σ in the CANONICALIZE slot, since they all
> fold to σ.

So you would need to search all characters to find those which have σ
in the CANONICALIZE slot -- not very efficient, to say the least.

IOW, what you suggest will provide a one-way mapping, whereas we need
a two-way mapping.

> > Besides, don't we also need to know that ς can only be present at the
> > end of a word?
>
> Don't think so. AFAIK, Unicode says nothing about ordering except when
> it comes to combining characters. But even if it did prescribe such a
> rule, I don't think it would have anything to do with case folding.

Who said this is only about case folding? Emacs should use this data
for up-casing and down-casing as well, for example, so that M-l
downcases Σ to ς, not σ, when it is at the end of the word. Wouldn't
users of Greek expect that?

> >> If this isn't already used for Unicode case folding, what _is_ it used
> >> for?
> >
> > It is used for case-insensitive regexp matching, see search.c.
>
> Right, but what I'm asking is: if Emacs doesn't do Unicode case folding,
> what is the purpose of the CANONICALIZE slot except as a kind of
> placeholder that gets autofilled?

Whenever you need the canonical equivalent of a character, such as in
case-insensitive search, you need that slot.

> Are there other kinds of case folding--other than traditional
> upper/lower and Unicode--that I'm not aware of?

There's "title case", of course. There are also characters whose case
pair is not a single character, but several, like the upper-case
variant of ß in German. Basically, any character not marked "C" in the
Unicode CaseFolding.txt is special in some way.

> I understand that Emacs autofills the CANONICALIZE slot from
> the other slots, but only when the CANONICALIZE slot is not already set
> to non-nil. What if the CANONICALIZE slot on ς were set to σ? I think
> that's all that would have to happen for the Unicode folding to work.
> It seems the machinery is already in place.

For this case, maybe (and even it doesn't handle Σ correctly, I think,
when downcased at the end of the word). For other cases, not
necessarily.

Personally, I think we need an additional slot for what you want, and
code to use it.

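Two of the points in this message can be made concrete. First, a canonicalize-style table only answers "what does this character fold to?". Answering "which characters fold to σ?" without scanning every character needs an inverse index built ahead of time. Second, context-sensitive and one-to-many mappings do not fit a table that holds one character per slot. A rough Python sketch follows. Plain dicts stand in for Emacs char-tables, `CANONICALIZE` and `build_inverse` are names invented here, and the last lines rely on CPython's built-in string methods rather than anything in Emacs.

```python
from collections import defaultdict

# One-way table: character -> representative of its equivalence set.
CANONICALIZE = {"\u03a3": "\u03c3", "\u03c3": "\u03c3", "\u03c2": "\u03c3"}


def build_inverse(canonicalize):
    """Group characters by representative so the reverse lookup is direct."""
    inverse = defaultdict(set)
    for ch, rep in canonicalize.items():
        inverse[rep].add(ch)
    return dict(inverse)


EQUIVALENTS = build_inverse(CANONICALIZE)

# All characters that should match σ, found without a full scan.
print(sorted(EQUIVALENTS["\u03c3"]))

# Context-sensitive lowercasing: in ΣΟΦΟΣ the word-final capital sigma
# becomes ς while the initial one becomes σ.
print("\u03a3\u039f\u03a6\u039f\u03a3".lower())

# One-to-many mappings: ß has no single-character uppercase or folded form.
print("\u00df".upper(), "\u00df".casefold())
```

The inverse index costs one pass over the table when it is built, after which each reverse lookup is a single dictionary access. That is one way to read the remark about needing an additional slot.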
From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 11:29:43 -0400
Message-ID: <87ioqxxbtk.fsf@nbtrap.com>
In-Reply-To: <838urtdpwk.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 29 Mar 2014 17:45:47 +0300")

Eli Zaretskii writes:

>> σ, ς, and Σ would all have σ in the CANONICALIZE slot, since they all
>> fold to σ.
>
> So you would need to search all characters to find those which have σ
> in the CANONICALIZE slot -- not very efficient, to say the least.

Doesn't this already happen?  If not, then what is the CANONICALIZE slot
doing that couldn't be done with the regular upcase/downcase slots by
themselves?

> IOW, what you suggest will provide a one-way mapping, whereas we need
> a two-way mapping.

Not sure I follow.  Seems to me the CANONICALIZE slot is sufficient, at
least in principle.

>> > Besides, don't we also need to know that ς can only be present at the
>> > end of a word?
>>
>> Don't think so.  AFAIK, Unicode says nothing about ordering except when
>> it comes to combining characters.  But even if it did prescribe such a
>> rule, I don't think it would have anything to do with case folding.
>
> Who said this is only about case folding?

I should have said just "case", not "case folding".

> Emacs should use this data for up-casing and down-casing as well, for
> example, so that M-l downcases Σ to ς, not σ, when it is at the end of
> the word.  Wouldn't users of Greek expect that?

Maybe.  I'm just saying that Unicode itself doesn't prescribe or even
recommend such behavior.  It defines case conversions independently of
ordering.

That said, making M-l downcase terminal Σ to ς would be a nice feature
that could be enabled, e.g., by enabling a minor mode or by modifying
some *-functions variable of functions that get called before the normal
behavior of M-l is applied, etc.  But it shouldn't have anything to do
with Unicode-compliant case-insensitive searching.

>> Right, but what I'm asking is: if Emacs doesn't do Unicode case folding,
>> what is the purpose of the CANONICALIZE slot except as a kind of
>> placeholder that gets autofilled?
>
> Whenever you need the canonical equivalent of a character, such as in
> case-insensitive search, you need that slot.

But there's nothing about the slot that mandates that only _pairs_ can
be case-equivalent under case folding.  Indeed, the manual speaks of
"sets" of characters that might be equivalent under case-folding, hence
my understanding that σ, ς, and Σ can all have σ in their CANONICALIZE
slot, and that's all it would take.  (Btw, I'm using "case-insensitive"
to mean the same as "under case-folding".)

>> Are there other kinds of case folding--other than traditional
>> upper/lower and Unicode--that I'm not aware of?
>
> There's "title case", of course.

I think title case would require an extra slot in the case table.

> There are also characters whose case pair is not a single character,
> but several, like the upper-case variant of ß in German.

Good point.  "ß" should fold to "ss".  I guess for the CANONICALIZE slot
to suffice, it would have to map to a string, not a code point.

> Personally, I think we need an additional slot for what you want, and
> code to use it.

Given the point about ß, you're probably right.  Unless we can make
entries in the CANONICALIZE slot be strings rather than code points.

From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 20:37:38 +0300
Message-ID: <831txkewil.fsf@gnu.org>
In-Reply-To: <87ioqxxbtk.fsf@nbtrap.com>

> From: Nathan Trapuzzano
> Cc: 17130@debbugs.gnu.org
> Date: Sat, 29 Mar 2014 11:29:43 -0400
>
> Eli Zaretskii writes:
>
> >> σ, ς, and Σ would all have σ in the CANONICALIZE slot, since they all
> >> fold to σ.
> >
> > So you would need to search all characters to find those which have σ
> > in the CANONICALIZE slot -- not very efficient, to say the least.
>
> Doesn't this already happen?

No, not when that slot is used for case-insensitive search.  You just
use it to get the canonical equivalent, i.e. use the one-way mapping
that it provides.

> If not, then what is the CANONICALIZE slot doing that couldn't be
> done with the regular upcase/downcase slots by themselves?

If that slot is "trivial", i.e. contains the lower-case variant of the
character, then indeed this slot doesn't add information, I think,
only utility.  But it doesn't have to contain the lower-case variant.

> > IOW, what you suggest will provide a one-way mapping, whereas we need
> > a two-way mapping.
>
> Not sure I follow.  Seems to me the CANONICALIZE slot is sufficient, at
> least in principle.

It is sufficient for mapping a character to its canonical equivalent,
but not finding the non-canonical variants of a canonical character.
IOW, it is not well suited to finding ς given just σ.

> > Emacs should use this data for up-casing and down-casing as well, for
> > example, so that M-l downcases Σ to ς, not σ, when it is at the end of
> > the word.  Wouldn't users of Greek expect that?
>
> Maybe.  I'm just saying that Unicode itself doesn't prescribe or even
> recommend such behavior.  It defines case conversions independently of
> ordering.
>
> That said, making M-l downcase terminal Σ to ς would be a nice feature
> that could be enabled, e.g., by enabling a minor mode or by modifying
> some *-functions variable of functions that get called before the normal
> behavior of M-l is applied, etc.  But it shouldn't have anything to do
> with Unicode-compliant case-insensitive searching.

For searching, you only need the CANONICALIZE slot.  But what about
replacing the search string while keeping the letter case in the
replacement?  For that, CANONICALIZE alone is not enough, you need the
reverse mapping.

> > Personally, I think we need an additional slot for what you want, and
> > code to use it.
>
> Given the point about ß, you're probably right.  Unless we can make
> entries in the CANONICALIZE slot be strings rather than code points.

This is Lisp; a vector slot can contain any Lisp object.  But using
CANONICALIZE for what you want would be wrong, I think, because it
will screw up case-insensitive search, which expects to find there a
single character.
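
The one-way versus two-way distinction discussed above can be sketched as
follows.  This is an illustrative Python model only, not Emacs's case-table
code: simple_fold stands in for a single-character CANONICALIZE lookup and
is built on Python's str.casefold, and build_reverse_table is a
hypothetical helper that inverts the mapping once so that finding ς given
σ does not require scanning every character at search time.

```python
def simple_fold(ch):
    """Stand-in for a one-character CANONICALIZE lookup.

    Uses Python's full case folding but keeps the character unchanged
    when the folding would expand to several characters.
    """
    folded = ch.casefold()
    return folded if len(folded) == 1 else ch


def build_reverse_table(canonicalize, chars):
    """Invert a one-way mapping: canonical char -> set of its variants."""
    reverse = {}
    for ch in chars:
        reverse.setdefault(canonicalize(ch), set()).add(ch)
    return reverse


# Forward direction: a single lookup per character.
print(simple_fold("ς"), simple_fold("Σ"), simple_fold("σ"))

# Reverse direction: one pass over the Greek and Coptic block builds the
# table; afterwards the variants of σ are a single dictionary lookup.
greek = [chr(cp) for cp in range(0x0370, 0x0400)]
reverse = build_reverse_table(simple_fold, greek)
print(sorted(reverse["σ"]))
```

The forward lookup is all that a fold-then-compare search needs; the
reverse table is the extra data that has to exist somewhere if code wants
to enumerate the variants of a canonical character.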
From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 14:31:52 -0400
Message-ID: <8761mwua93.fsf@nbtrap.com>
In-Reply-To: <831txkewil.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 29 Mar 2014 20:37:38 +0300")

Eli Zaretskii writes:

>> > So you would need to search all characters to find those which have σ
>> > in the CANONICALIZE slot -- not very efficient, to say the least.
>>
>> Doesn't this already happen?
>
> No, not when that slot is used for case-insensitive search.  You just
> use it to get the canonical equivalent, i.e. use the one-way mapping
> that it provides.

I still don't get it.  What I say below may explain why.

>> If not, then what is the CANONICALIZE slot doing that couldn't be
>> done with the regular upcase/downcase slots by themselves?
>
> If that slot is "trivial", i.e. contains the lower-case variant of the
> character, then indeed this slot doesn't add information, I think,
> only utility.  But it doesn't have to contain the lower-case variant.

I know.  But if Emacs doesn't do Unicode folding, what is there other
than lower/upper variants?

>> > IOW, what you suggest will provide a one-way mapping, whereas we need
>> > a two-way mapping.
>>
>> Not sure I follow.  Seems to me the CANONICALIZE slot is sufficient, at
>> least in principle.
>
> It is sufficient for mapping a character to its canonical equivalent,
> but not finding the non-canonical variants of a canonical character.
> IOW, it is not well suited to finding ς given just σ.

Finding the non-canonical variants is not something that happens (at
least in principle) during case-insensitive matching.  You convert both
the matching string and the string being matched into their canonical
equivalents and see if they match.  You never UNfold.  Case folding is
by definition a one-way operation.

>> That said, making M-l downcase terminal Σ to ς would be a nice feature
>> that could be enabled, e.g., by enabling a minor mode or by modifying
>> some *-functions variable of functions that get called before the normal
>> behavior of M-l is applied, etc.  But it shouldn't have anything to do
>> with Unicode-compliant case-insensitive searching.
>
> For searching, you only need the CANONICALIZE slot.  But what about
> replacing the search string while keeping the letter case in the
> replacement?  For that, CANONICALIZE alone is not enough, you need the
> reverse mapping.

There is no reverse mapping when it comes to folding.  There can't be,
since multiple characters can fold into the same character.

I don't fully understand what "case-replace" does (e.g. case being a
property of characters and not strings, what does it mean to "preserve
case" when replacing a string of length x with a string of length y
where x != y), but I don't think Unicode folding would complicate it.
There are three cases in Unicode: lower, upper, and title.  Upper and
title already overlap for the vast majority of codepoints, so there you
already have problems with a case-preserving replace.  That said "fold"
is not a case in Unicode; it's a one-way mapping of non-overlapping sets
of characters to a canonical equivalent, so it makes no sense to talk
about preserving case with respect to case folding.

Notandum: I was wrong about Unicode saying nothing about character
ordering for non-combining characters.  The "special casing" document
(ftp://ftp.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt) contains
context- and language- dependent case rules for certain characters,
including final sigma.  Notably, the document says that Σ in terminal
position should (or "may"--I'm not really sure about how to interpret
the document) downcase to ς.  That said, the document has _nothing_ to
do with case _folding_, which is always context- and language-
independent.

Rightly interpreted, therefore, case _conversion_ (such as in
case-preserving replace) and case-insensitive _searching_ (i.e. case
folding), according to Unicode, are orthogonal.  We don't have to
address both at the same time.

>> Given the point about ß, you're probably right.  Unless we can make
>> entries in the CANONICALIZE slot be strings rather than code points.
>
> This is Lisp; a vector slot can contain any Lisp object.  But using
> CANONICALIZE for what you want would be wrong, I think, because it
> will screw up case-insensitive search, which expects to find there a
> single character.

Right, that's what I meant.  Putting strings there would break
something.
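
The "fold both sides, then compare" procedure described in this message
can be sketched in a few lines of Python.  This is only an illustration
of the idea using Python's built-in str.casefold (which applies Unicode
full case folding); it says nothing about how Emacs implements search,
and fold_equal is a name invented for the example.

```python
def fold_equal(a, b):
    """Case-insensitive comparison: fold both strings, never unfold."""
    return a.casefold() == b.casefold()


# Folding is context-independent: σ, ς and Σ all fold to σ wherever
# they occur, so all three spellings compare equal.
print(fold_equal("ΟΔΟΣ", "οδος"))
print(fold_equal("οδος", "οδοσ"))

# Full folding may change the length of a string: ß folds to "ss".
print(fold_equal("Straße", "STRASSE"))

# Case conversion is a separate operation.  CPython's str.lower() is
# expected to apply the context-dependent final-sigma rule here, which
# folding never does.
print("ΟΔΟΣ".lower(), "ΟΔΟΣ".casefold())
```

Because the comparison only ever goes from each string to its folded
form, no reverse mapping is consulted at any point.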
From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 14:36:42 -0400
Message-ID: <87eh1kdf7p.fsf@nbtrap.com>
In-Reply-To: <8761mwua93.fsf@nbtrap.com> (Nathan Trapuzzano's message of "Sat, 29 Mar 2014 14:31:52 -0400")

Nathan Trapuzzano writes:

> Rightly interpreted, therefore, case _conversion_ (such as in
> case-preserving replace) and case-insensitive _searching_ (i.e. case
> folding), according to Unicode, are orthogonal.  We don't have to
> address both at the same time.

Er, let me rephrase.  Case _conversion_ (such as in case-preserving
replace) and case _folding_ (such as ought be used in case-insensitive
searching) are orthogonal.
From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 22:50:40 +0300
Message-ID: <83vbuwdbsf.fsf@gnu.org>
In-Reply-To: <8761mwua93.fsf@nbtrap.com>

> From: Nathan Trapuzzano
> Cc: 17130@debbugs.gnu.org
> Date: Sat, 29 Mar 2014 14:31:52 -0400
>
> >> If not, then what is the CANONICALIZE slot doing that couldn't be
> >> done with the regular upcase/downcase slots by themselves?
> >
> > If that slot is "trivial", i.e. contains the lower-case variant of the
> > character, then indeed this slot doesn't add information, I think,
> > only utility.  But it doesn't have to contain the lower-case variant.
>
> I know.  But if Emacs doesn't do Unicode folding, what is there other
> than lower/upper variants?

You can make it have whatever you like, because you can set up
buffer-specific tables.

> >> Not sure I follow.  Seems to me the CANONICALIZE slot is sufficient, at
> >> least in principle.
> >
> > It is sufficient for mapping a character to its canonical equivalent,
> > but not finding the non-canonical variants of a canonical character.
> > IOW, it is not well suited to finding ς given just σ.
>
> Finding the non-canonical variants is not something that happens (at
> least in principle) during case-insensitive matching.

The case database is not only for searching.

> > For searching, you only need the CANONICALIZE slot.  But what about
> > replacing the search string while keeping the letter case in the
> > replacement?  For that, CANONICALIZE alone is not enough, you need the
> > reverse mapping.
>
> There is no reverse mapping when it comes to folding.  There can't be,
> since multiple characters can fold into the same character.

You can use the case of the string being replaced as guidelines.
E.g., if the replaced string was capitalized, you can capitalize the
replacement.
From unknown Mon Jun 23 20:19:33 2025
From: Eli Zaretskii
To: Nathan Trapuzzano
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 22:51:20 +0300
Message-ID: <83txagdbrb.fsf@gnu.org>
In-Reply-To: <87eh1kdf7p.fsf@nbtrap.com>

> From: Nathan Trapuzzano
> Cc: 17130@debbugs.gnu.org
> Date: Sat, 29 Mar 2014 14:36:42 -0400
>
> Er, let me rephrase.  Case _conversion_ (such as in case-preserving
> replace) and case _folding_ (such as ought be used in case-insensitive
> searching) are orthogonal.

But they can very well use the same database.

From unknown Mon Jun 23 20:19:33 2025
From: Nathan Trapuzzano
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
Date: Sat, 29 Mar 2014 16:01:10 -0400
Message-ID: <878ursydod.fsf@nbtrap.com>
In-Reply-To: <87txair0g7.fsf@ivytech.edu>

Eli Zaretskii writes:

>> I know.  But if Emacs doesn't do Unicode folding, what is there other
>> than lower/upper variants?
>
> You can make it have whatever you like, because you can set up
> buffer-specific tables.

Makes me wonder if whoever implemented the CANONICALIZE slot had Unicode
folding in mind.

>> Finding the non-canonical variants is not something that happens (at
>> least in principle) during case-insensitive matching.
>
> The case database is not only for searching.
>
>> There is no reverse mapping when it comes to folding.  There can't be,
>> since multiple characters can fold into the same character.
>
> You can use the case of the string being replaced as guidelines.
> E.g., if the replaced string was capitalized, you can capitalize the
> replacement.

I think you're still conflating case conversion and case folding.  As I
said, there is no case called "fold".  There's just upper, lower, and
title.  And the fact that these three overlap is already a problem for
case-preserving replace.  I spent most of my last email trying to
explain this.

From unknown Mon Jun 23 20:19:33 2025
Subject: bug#17130: 24.4.50; Deficient Unicode case folding
To: Eli Zaretskii
Cc: 17130@debbugs.gnu.org
Received: (qmail 27795 invoked by uid 0); 29 Mar 2014 20:15:44 -0000 Received: from unknown (HELO cmgw4) (10.0.90.85) by gproxy3.mail.unifiedlayer.com with SMTP; 29 Mar 2014 20:15:44 -0000 Received: from host393.hostmonster.com ([66.147.240.193]) by cmgw4 with id jfFd1n00H4B3kjm01fFgwy; Sat, 29 Mar 2014 21:15:43 -0600 X-Authority-Analysis: v=2.1 cv=L+eOHYj8 c=1 sm=1 tr=0 a=GZ6qK+eS4AuCRVUKGEKC+Q==:117 a=GZ6qK+eS4AuCRVUKGEKC+Q==:17 a=DsvgjBjRAAAA:8 a=f5113yIGAAAA:8 a=4GsTxW34auoA:10 a=2__L0ovz5gcA:10 a=lfvU_ReahkwA:10 a=ngU5ixn2AAAA:8 a=fWyWhr6xdMwA:10 a=mDV3o1hIAAAA:8 a=kUnNvfyN5qT8Sf2AZVcA:9 a=ii61gXl28gQA:10 a=T1GfT9_SXI0A:10 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nbtrap.com; s=default; h=Content-Type:MIME-Version:Message-ID:In-Reply-To:Date:References:Subject:Cc:To:From; bh=bM+hZHSzSlq9w22KeGxyww68M1ynTz2d3qZvldkTeTA=; b=CqKherdB5zWLeh/B1njvHsi0Q2WofUXqQaCXXIxKKtBfIK+K6GgnRPgNjJvFM1ckKyUNJ0iAQIedFVSaPdbYQCfibNog/sK8CXpULlhvGKWA5c3ze5shaeg1GAC+97xR; Received: from [50.90.253.209] (port=52038 helo=Nathan-GNU) by host393.hostmonster.com with esmtpsa (TLSv1.2:CAMELLIA128-SHA:128) (Exim 4.82) (envelope-from ) id 1WTzfG-00025S-B0; Sat, 29 Mar 2014 14:15:38 -0600 From: Nathan Trapuzzano References: <87txair0g7.fsf@ivytech.edu> <83fvm2fhii.fsf@gnu.org> <87ob0qrugy.fsf@nbtrap.com> <83y4ztec5l.fsf@gnu.org> <87ob0pnptc.fsf@nbtrap.com> <83d2h5du2e.fsf@gnu.org> <87eh1lcdaj.fsf@nbtrap.com> <838urtdpwk.fsf@gnu.org> <87ioqxxbtk.fsf@nbtrap.com> <831txkewil.fsf@gnu.org> <8761mwua93.fsf@nbtrap.com> <87eh1kdf7p.fsf@nbtrap.com> <83txagdbrb.fsf@gnu.org> Date: Sat, 29 Mar 2014 16:15:34 -0400 In-Reply-To: <83txagdbrb.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 29 Mar 2014 22:51:20 +0300") Message-ID: <87siq0sqvt.fsf@nbtrap.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Identified-User: {1585:host393.hostmonster.com:nbtrapco:nbtrap.com} {sentby:smtp auth 50.90.253.209 authed with 
nbtrap@nbtrap.com} X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) Eli Zaretskii writes: >> From: Nathan Trapuzzano >> Cc: 17130@debbugs.gnu.org >> Date: Sat, 29 Mar 2014 14:36:42 -0400 >> >> Er, let me rephrase. Case _conversion_ (such as in case-preserving >> replace) and case _folding_ (such as ought be used in case-insensitive >> searching) are orthogonal. > > But they can very well use the same database. It's not clear what you mean. We already have a place to store upper- and lower-case variants. What I'm proposing is to use the CANONICALIZE slot as a place to store the case-folding mapping. If this would mess up Emacs' case-preserving replace, then I think that would just mean that case-preserving replace is broken. There is no such case as "canonicalize"--you can't say, "Oh, this string is in the canonical case, so when I want to replace it with this other string in canonical case". A case-preserving replace should only consult the upper- and lower-case slots (and perhaps the title-case slot if it existed). 
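The difference between case conversion and case folding can be shown with a minimal sketch. It is written in Python rather than Emacs Lisp and uses Python's built-in `str.casefold`, `str.upper`, and `str.lower` as stand-ins for Unicode folding and conversion. It makes no claim about how Emacs case tables or the CANONICALIZE slot are implemented, and the helper name `fold_equal` is hypothetical.

```python
def fold_equal(a: str, b: str) -> bool:
    """Compare two strings case-insensitively via Unicode case folding.

    Hypothetical helper for illustration only; not an Emacs API.
    """
    return a.casefold() == b.casefold()


SIGMA = "\u03c3"          # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"    # GREEK SMALL LETTER FINAL SIGMA
CAPITAL_SIGMA = "\u03a3"  # GREEK CAPITAL LETTER SIGMA

# Folding: both small sigmas fold to the same character, so they compare equal.
folded_match = fold_equal(SIGMA, FINAL_SIGMA)

# Conversion: lower-casing leaves the two characters distinct.
lower_match = SIGMA.lower() == FINAL_SIGMA.lower()

# Conversion is not invertible: final sigma upper-cases to capital sigma,
# but a lone capital sigma lower-cases to the non-final small sigma.
round_trip = FINAL_SIGMA.upper().lower()

print(folded_match, lower_match, round_trip == SIGMA)
```

Under these assumptions the folded comparison succeeds while the lower-case comparison does not, which is why a folding table cannot double as a "case" to convert into.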
From unknown Mon Jun 23 20:19:33 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17130: 24.4.50; Deficient Unicode case folding Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 30 Mar 2014 02:46:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17130 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Nathan Trapuzzano Cc: 17130@debbugs.gnu.org Reply-To: Eli Zaretskii Received: via spool by 17130-submit@debbugs.gnu.org id=B17130.139614754427756 (code B ref 17130); Sun, 30 Mar 2014 02:46:01 +0000 Received: (at 17130) by debbugs.gnu.org; 30 Mar 2014 02:45:44 +0000 Received: from localhost ([127.0.0.1]:56559 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WU5km-0007Dc-7M for submit@debbugs.gnu.org; Sat, 29 Mar 2014 22:45:44 -0400 Received: from mtaout20.012.net.il ([80.179.55.166]:40113) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WU5ki-0007DR-D4 for 17130@debbugs.gnu.org; Sat, 29 Mar 2014 22:45:42 -0400 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N38000009ES5X00@a-mtaout20.012.net.il> for 17130@debbugs.gnu.org; Sun, 30 Mar 2014 05:45:38 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N38000EXAC2A720@a-mtaout20.012.net.il>; Sun, 30 Mar 2014 05:45:38 +0300 (IDT) Date: Sun, 30 Mar 2014 05:45:39 +0300 From: Eli Zaretskii In-reply-to: <87siq0sqvt.fsf@nbtrap.com> X-012-Sender: halo1@inter.net.il Message-id: <83siq0csks.fsf@gnu.org> References: <87txair0g7.fsf@ivytech.edu> <83fvm2fhii.fsf@gnu.org> <87ob0qrugy.fsf@nbtrap.com> <83y4ztec5l.fsf@gnu.org> <87ob0pnptc.fsf@nbtrap.com> <83d2h5du2e.fsf@gnu.org> <87eh1lcdaj.fsf@nbtrap.com> <838urtdpwk.fsf@gnu.org> <87ioqxxbtk.fsf@nbtrap.com> <831txkewil.fsf@gnu.org> <8761mwua93.fsf@nbtrap.com> <87eh1kdf7p.fsf@nbtrap.com> 
<83txagdbrb.fsf@gnu.org> <87siq0sqvt.fsf@nbtrap.com> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Nathan Trapuzzano > Cc: 17130@debbugs.gnu.org > Date: Sat, 29 Mar 2014 16:15:34 -0400 > > Eli Zaretskii writes: > > >> From: Nathan Trapuzzano > >> Cc: 17130@debbugs.gnu.org > >> Date: Sat, 29 Mar 2014 14:36:42 -0400 > >> > >> Er, let me rephrase. Case _conversion_ (such as in case-preserving > >> replace) and case _folding_ (such as ought be used in case-insensitive > >> searching) are orthogonal. > > > > But they can very well use the same database. > > It's not clear what you mean. You keep asking questions about the purpose of the CANONICALIZE slot, and I keep trying to explain that purpose. > We already have a place to store upper- and lower-case variants. What > I'm proposing is to use the CANONICALIZE slot as a place to store the > case-folding mapping. If this would mess up Emacs' case-preserving > replace, then I think that would just mean that case-preserving replace > is broken. There is no such case as "canonicalize"--you can't say, "Oh, > this string is in the canonical case, so when I want to replace it with > this other string in canonical case". A case-preserving replace should > only consult the upper- and lower-case slots (and perhaps the title-case > slot if it existed). Perhaps you should explain what this means in practice, from the POV of populating the CANONICALIZE slot, and how that content would be used under your proposal. That should make the discussion more useful, I hope. 
From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 15 00:00:38 2014 Received: (at control) by debbugs.gnu.org; 15 Apr 2014 04:00:38 +0000 Received: from localhost ([127.0.0.1]:48462 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WZuY0-0003VD-1e for submit@debbugs.gnu.org; Tue, 15 Apr 2014 00:00:36 -0400 Received: from smtp.cs.ucla.edu ([131.179.128.62]:54163) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WZuXw-0003Kn-E5 for control@debbugs.gnu.org; Tue, 15 Apr 2014 00:00:33 -0400 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id EB0FF39E8018 for ; Mon, 14 Apr 2014 21:00:26 -0700 (PDT) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YCDaXOyvBXmB for ; Mon, 14 Apr 2014 21:00:18 -0700 (PDT) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 6F8BA39E8014 for ; Mon, 14 Apr 2014 21:00:18 -0700 (PDT) Message-ID: <534CAED2.7090508@cs.ucla.edu> Date: Mon, 14 Apr 2014 21:00:18 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: control@debbugs.gnu.org Subject: 17130 is a wishlist item Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) severity 17130 wishlist thanks From unknown Mon Jun 23 20:19:33 2025 X-Loop: help-debbugs@gnu.org Subject: bug#17130: 24.4.50; Deficient 
Unicode case folding Resent-From: Lars Ingebrigtsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 29 Sep 2019 14:24:01 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 17130 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Nathan Trapuzzano Cc: 17130@debbugs.gnu.org Received: via spool by 17130-submit@debbugs.gnu.org id=B17130.156976700925417 (code B ref 17130); Sun, 29 Sep 2019 14:24:01 +0000 Received: (at 17130) by debbugs.gnu.org; 29 Sep 2019 14:23:29 +0000 Received: from localhost ([127.0.0.1]:55389 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iEa6j-0006bq-79 for submit@debbugs.gnu.org; Sun, 29 Sep 2019 10:23:29 -0400 Received: from quimby.gnus.org ([80.91.231.51]:55812) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iEa6e-0006be-SE for 17130@debbugs.gnu.org; Sun, 29 Sep 2019 10:23:27 -0400 Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=marnie) by quimby.gnus.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1iEa6b-0001p3-9b; Sun, 29 Sep 2019 16:23:23 +0200 From: Lars Ingebrigtsen References: <87txair0g7.fsf@ivytech.edu> Date: Sun, 29 Sep 2019 16:23:21 +0200 In-Reply-To: <87txair0g7.fsf@ivytech.edu> (Nathan Trapuzzano's message of "Fri, 28 Mar 2014 08:07:20 -0400") Message-ID: <874l0vi3l2.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. 
Content preview: Nathan Trapuzzano writes: > M-: (compare-strings "σ" nil nil "ς" nil nil t) > > ==> -1 ;; should be t (compare-strings "σ" nil nil "ς" nil nil t) => t Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Nathan Trapuzzano writes: > M-: (compare-strings "σ" nil nil "ς" nil nil t) > > ==> -1 ;; should be t (compare-strings "σ" nil nil "ς" nil nil t) => t I'm unable to reproduce this in Emacs 27, so I'm going to go ahead and guess that this has been fixed in the years since this bug was reported, and I'm closing this bug report. If this is still a problem, please reopen. -- (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Sun Sep 29 10:23:32 2019 Received: (at control) by debbugs.gnu.org; 29 Sep 2019 14:23:32 +0000 Received: from localhost ([127.0.0.1]:55392 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iEa6m-0006c8-GB for submit@debbugs.gnu.org; Sun, 29 Sep 2019 10:23:32 -0400 Received: from quimby.gnus.org ([80.91.231.51]:55826) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1iEa6j-0006bs-QR for control@debbugs.gnu.org; Sun, 29 Sep 2019 10:23:30 -0400 Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=marnie) by quimby.gnus.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.89) (envelope-from ) id 1iEa6h-0001pA-6h for control@debbugs.gnu.org; Sun, 29 Sep 2019 16:23:29 +0200 Date: Sun, 29 Sep 2019 16:23:26 +0200 Message-Id: <8736gfi3kx.fsf@gnus.org> To: control@debbugs.gnu.org From: Lars Ingebrigtsen Subject: control message for bug #17130 X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: close 17130 quit Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) close 17130 quit