From unknown Sat Jun 21 10:31:14 2025
From: bug#16731 <16731@debbugs.gnu.org>
To: bug#16731 <16731@debbugs.gnu.org>
Subject: Status: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Sat, 21 Jun 2025 17:31:14 +0000

retitle 16731 24.3.50; Latin small letter sharp s is not considered lower-case
reassign 16731 emacs
submitter 16731 Jorgen Schaefer
severity 16731 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 12:30:28 2014
From: Jorgen Schaefer
To: bug-gnu-emacs@gnu.org
Subject: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 18:29:23 +0100
Message-ID: <87wqh08cjw.fsf@loki.jorgenschaefer.de>

Hi!

The following seems like a bug:

(string-match "[[:lower:]]" "ß")
  => nil

`describe-char' for this says:

  name: LATIN SMALL LETTER SHARP S
  general-category: Ll (Letter, Lowercase)
  decomposition: (223) ('ß')

Not sure why it would not be considered a lower-case letter. Umlauts
like ä, ö and ü are matched correctly.

Regards,
        -- Jorgen

Configured using:
 `configure --without-x'

Important settings:
  value of $LC_ALL:
  value of $LC_COLLATE: de_DE.UTF-8
  value of $LC_CTYPE: de_DE.UTF-8
  value of $LC_MESSAGES: POSIX
  value of $LC_MONETARY: POSIX
  value of $LC_NUMERIC: POSIX
  value of $LC_TIME: POSIX
  value of $LANG: POSIX
  locale-coding-system: utf-8-unix

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 12:55:31 2014
From: Glenn Morris
To: Jorgen Schaefer
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 12:55:22 -0500
Message-ID: <57txc4cj1x.fsf@fencepost.gnu.org>

Jorgen Schaefer wrote:

> Not sure why it would not be considered a lower-case letter. Umlauts
> like ä, ö and ü are matched correctly.

See http://debbugs.gnu.org/10576

(I have no idea whether this is an Emacs bug or not.)

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 12:55:41 2014
From: Glenn Morris
Subject: control message for bug 16731
Date: Wed, 12 Feb 2014 12:55:38 -0500

merge 10576 16731

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 14:27:42 2014
From: Andreas Röhler
To: bug-gnu-emacs@gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 20:31:20 +0100
Message-ID: <52FBCC08.50509@easy-emacs.de>

On 12.02.2014 18:55, Glenn Morris wrote:
> Jorgen Schaefer wrote:
>
>> Not sure why it would not be considered a lower-case letter. Umlauts
>> like ä, ö and ü are matched correctly.
>
> See http://debbugs.gnu.org/10576
>
> (I have no idea whether this is an Emacs bug or not.)
>

IMO the answer given at link is not valid.

Indeed the implementation in buffer.h does check

  && upcase1 (c)

and expects a result, i.e. it ignores the fact that some characters
might not have an upcase variant.

When seeing there is a downcase-table, the check probably should be done against this.
From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 14:49:45 2014
From: Eli Zaretskii
To: Andreas Röhler
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 21:49:25 +0200
Message-ID: <83fvnoru0q.fsf@gnu.org>

> Date: Wed, 12 Feb 2014 20:31:20 +0100
> From: Andreas Röhler
>
> > See http://debbugs.gnu.org/10576
> >
> > (I have no idea whether this is an Emacs bug or not.)
> >
>
> IMO the answer given at link is not valid.

It accurately describes what happens in the code, so it's definitely
valid.

> When seeing there is a downcase-table, the check probably should be done against this.

Not sure what you mean by that, please elaborate.

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 15:07:00 2014
From: Andreas Röhler
To: Eli Zaretskii
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 21:10:57 +0100
Message-ID: <52FBD551.2000808@easy-emacs.de>

On 12.02.2014 20:49, Eli Zaretskii wrote:
>> Date: Wed, 12 Feb 2014 20:31:20 +0100
>> From: Andreas Röhler
>>
>>> See http://debbugs.gnu.org/10576
>>>
>>> (I have no idea whether this is an Emacs bug or not.)
>>>
>>
>> IMO the answer given at link is not valid.
>
> It accurately describes what happens in the code, so it's definitely
> valid.
>
>> When seeing there is a downcase-table, the check probably should be done against this.
>
> Not sure what you mean by that, please elaborate.
>

See buffer.h

IIUC the mentioned lowercasep is implemented as

  !uppercasep (c) && upcase1 (c) != c;

upcase1 (c) must fail, as there is no upcased variant of this char.

While upcase1 can't succeed, downcase should - if "ß" is a member of downcase_table.
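
The check described in the message above can be modeled outside Emacs.
What follows is only a minimal Python sketch, not the Emacs C code: the
dictionaries UPCASE and DOWNCASE and the helper set_case_pair are
hypothetical stand-ins for the case tables and for set-case-syntax-pair,
and defining uppercasep as "downcasing changes the character" is an
assumption of the sketch. It shows why a character that was never
registered as half of a pair cannot pass the test.

```python
# Toy model of pair-based case tables; hypothetical stand-ins, not Emacs data.
UPCASE = {}    # maps a lowercase character to its uppercase partner
DOWNCASE = {}  # maps an uppercase character to its lowercase partner


def set_case_pair(upper, lower):
    """Register an upper/lower pair (loosely modeled on set-case-syntax-pair)."""
    DOWNCASE[upper] = lower
    UPCASE[lower] = upper


def downcase(c):
    # Characters without an entry map to themselves.
    return DOWNCASE.get(c, c)


def upcase1(c):
    # Characters without an entry map to themselves.
    return UPCASE.get(c, c)


def uppercasep(c):
    # Assumption of this sketch: uppercase means "downcasing changes it".
    return downcase(c) != c


def lowercasep(c):
    # The check quoted in the thread: !uppercasep (c) && upcase1 (c) != c
    return not uppercasep(c) and upcase1(c) != c


set_case_pair("Ä", "ä")
set_case_pair("D", "d")

# "ß" and "7" were never registered in a pair, so both come out False.
for ch in ["ä", "d", "ß", "7"]:
    print(ch, lowercasep(ch))
```

In this model "ß" and the digit "7" are indistinguishable: both simply
map to themselves in either table.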
From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 15:16:58 2014
From: Eli Zaretskii
To: Andreas Röhler
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 22:16:40 +0200
Message-ID: <83bnycrsrb.fsf@gnu.org>

> Date: Wed, 12 Feb 2014 21:10:57 +0100
> From: Andreas Röhler
> CC: 16731@debbugs.gnu.org
>
> While upcase1 can't succeed, downcase should - if "ß" is a member of downcase_table.

But which character do you want to downcase in this case?

This whole logic works only for _pairs_ of characters (and the
char-table used here is populated by calls to set-case-syntax-pair).
Such machinery cannot possibly work when there's no pair.

The only way I can see out of this conundrum is to consult the
Lowercase Unicode property of the character as fallback, assuming that
won't slow down regex search too much.

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 15:29:38 2014
From: Andreas Röhler
To: Eli Zaretskii
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 21:33:31 +0100
Message-ID: <52FBDA9B.102@easy-emacs.de>

On 12.02.2014 21:16, Eli Zaretskii wrote:
>> Date: Wed, 12 Feb 2014 21:10:57 +0100
>> From: Andreas Röhler
>> CC: 16731@debbugs.gnu.org
>>
>> While upcase1 can't succeed, downcase should - if "ß" is a member of downcase_table.
>
> But which character do you want to downcase in this case?
>
> This whole logic works only for _pairs_ of characters (and the
> char-table used here is populated by calls to set-case-syntax-pair).

So populate it differently, resp. allow empty slots.

> Such machinery cannot possibly work when there's no pair.
>
> The only way I can see out of this conundrum is to consult the
> Lowercase Unicode property of the character as fallback, assuming that
> won't slow down regex search too much.
>

You can do (downcase "d") for example, which results in "d".

Instead of

  upcase1 (c) != c

what about

  downcase (c) == c

?
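
Eli Zaretskii's suggestion quoted in the message above, consulting the
character's Unicode lowercase property when the case tables have no pair
for it, could look roughly like the Python sketch below. This is an
illustration under stated assumptions, not a patch: UPCASE and DOWNCASE
are hypothetical toy tables, and the general category "Ll" (the value
describe-char reported for "ß" in the original report) is used as an
approximation of the Unicode Lowercase property.

```python
import unicodedata

# Toy pair tables; hypothetical stand-ins for the Emacs case tables.
UPCASE = {"ä": "Ä", "d": "D"}    # lowercase -> uppercase partner
DOWNCASE = {"Ä": "ä", "D": "d"}  # uppercase -> lowercase partner


def lowercasep_with_fallback(c):
    if DOWNCASE.get(c, c) != c:
        return False  # c has a lowercase partner, so it is uppercase
    if UPCASE.get(c, c) != c:
        return True   # c has an uppercase partner, so it is lowercase
    # No pair at all: fall back to the Unicode general category.
    # "Ll" (Letter, lowercase) approximates the Lowercase property here.
    return unicodedata.category(c) == "Ll"


for ch in ["ä", "ß", "Ä", "7", "!"]:
    print(ch, lowercasep_with_fallback(ch))
```

The property lookup only runs for characters without a pair, which is
one way to limit the cost that the message worries about for regex
search; whether that is cheap enough in Emacs itself is not something
this sketch can show.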
From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 15:58:13 2014
From: Juanma Barranquero
To: Andreas Röhler
Cc: Eli Zaretskii, 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Wed, 12 Feb 2014 21:57:23 +0100

On Wed, Feb 12, 2014 at 9:33 PM, Andreas Röhler wrote:

> what about
>
> downcase (c) == c

Won't that be true for characters that have no upcase/downcase
difference, like digits?

    J

From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 12 22:46:41 2014
From: Eli Zaretskii
To: Andreas Röhler
Cc: 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Thu, 13 Feb 2014 05:46:22 +0200
Message-ID: <83a9dvsmi9.fsf@gnu.org>

> Date: Wed, 12 Feb 2014 21:33:31 +0100
> From: Andreas Röhler
> CC: 16731@debbugs.gnu.org
>
> On 12.02.2014 21:16, Eli Zaretskii wrote:
> >> Date: Wed, 12 Feb 2014 21:10:57 +0100
> >> From: Andreas Röhler
> >> CC: 16731@debbugs.gnu.org
> >>
> >> While upcase1 can't succeed, downcase should - if "ß" is a member of downcase_table.
> >
> > But which character do you want to downcase in this case?
> >
> > This whole logic works only for _pairs_ of characters (and the
> > char-table used here is populated by calls to set-case-syntax-pair).
>
> So populate it differently, resp. allow empty slots.

How will we then be able to distinguish between lower-case characters
that have no upcase variant and characters that are not lower-case
characters at all?

> You can do (downcase "d") for example, which results in "d".
>
> Instead of
>
> upcase1 (c) != c
>
> what about
>
> downcase (c) == c
>
> ?

The same is true for any non-letter, like punctuation.
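
The objection raised in the two messages above is easy to reproduce.
The Python sketch below is only an analogy: it uses Python's built-in
str.lower() as a stand-in for the Emacs downcase table (an assumption,
since the two mappings are not guaranteed to agree for every character),
and proposed_lowercasep is a hypothetical name for the suggested test
downcase (c) == c.

```python
def proposed_lowercasep(c):
    # The proposed test: a character is lowercase if downcasing leaves
    # it unchanged. str.lower() stands in for the downcase table.
    return c.lower() == c


# Digits and punctuation are unchanged by downcasing too, so the test
# accepts them along with real lowercase letters.
for ch in ["d", "ß", "7", "!", "D"]:
    print(ch, proposed_lowercasep(ch))
```

Only "D" is rejected; "7" and "!" pass just like "d" and "ß", which is
the false-positive problem pointed out for digits and punctuation.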
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 03:23:50 2014
From: Andreas Röhler
To: Eli Zaretskii
Cc: Juanma Barranquero, 16731@debbugs.gnu.org
Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case
Date: Thu, 13 Feb 2014 09:27:43 +0100
Message-ID: <52FC81FF.5090202@easy-emacs.de>

On 13.02.2014 04:46, Eli Zaretskii wrote:
>> Date: Wed, 12 Feb 2014 21:33:31 +0100
>> From: Andreas Röhler
>> CC: 16731@debbugs.gnu.org
>>
>> On 12.02.2014 21:16, Eli Zaretskii wrote:
>>>> Date: Wed, 12 Feb 2014 21:10:57 +0100
>>>> From: Andreas Röhler
>>>> CC: 16731@debbugs.gnu.org
>>>>
>>>> While upcase1 can't succeed, downcase should - if "ß" is a member of downcase_table.
>>>
>>> But which character do you want to downcase in this case?
>>>
>>> This whole logic works only for _pairs_ of characters (and the
>>> char-table used here is populated by calls to set-case-syntax-pair).
>>
>> So populate it differently, resp. allow empty slots.
>
> How will we then be able to distinguish between lower-case characters
> that have no upcase variant and characters that are not lower-case
> characters at all?
>
>> You can do (downcase "d") for example, which results in "d".
>>
>> Instead of
>>
>> upcase1 (c) != c
>>
>> what about
>>
>> downcase (c) == c
>>
>> ?
>
> The same is true for any non-letter, like punctuation.
>

Okay, right.

So it seems upcase_table is populated wrongly with "ß"?
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 08:37:57 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 13:37:57 +0000 Received: from localhost ([127.0.0.1]:50149 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDwUG-00042K-5Y for submit@debbugs.gnu.org; Thu, 13 Feb 2014 08:37:56 -0500 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:49172) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDwUB-00041t-FR for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 08:37:52 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlIJABK/CFFLd+mu/2dsb2JhbABEuzWCVgQEexdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IPAS-Result: AlIJABK/CFFLd+mu/2dsb2JhbABEuzWCVgQEexdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="47524646" Received: from 75-119-233-174.dsl.teksavvy.com (HELO pastel.home) ([75.119.233.174]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 13 Feb 2014 08:37:45 -0500 Received: by pastel.home (Postfix, from userid 20848) id 959ED60079; Thu, 13 Feb 2014 08:37:45 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> Date: Thu, 13 Feb 2014 08:37:45 -0500 In-Reply-To: <83a9dvsmi9.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 05:46:22 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 16731 Cc: Andreas =?windows-1252?Q?R=F6hler?= , 16731@debbugs.gnu.org X-BeenThere: 
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/)

> How will we then be able to distinguish between lower-case characters
> that have no upcase variant and characters that are not lower-case
> characters at all?

Right: to handle this, we need to distinguish characters that are
lower-case without an uppercase variant from characters which are
neither lowercase nor uppercase.

We could do that by saying that the upcase table should return nil or -1
for ß, to indicate that the upcase version is "missing".  But such
a change will probably require carefully revising "all" the code that
uses those tables.


        Stefan

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 10:53:25 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 15:53:26 +0000 Received: from localhost ([127.0.0.1]:50914 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDybN-0007qz-HG for submit@debbugs.gnu.org; Thu, 13 Feb 2014 10:53:25 -0500 Received: from mtaout29.012.net.il ([80.179.55.185]:46547) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDybJ-0007qb-HW for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 10:53:22 -0500 Received: from conversion-daemon.mtaout29.012.net.il by mtaout29.012.net.il (HyperSendmail v2007.08) id <0N0X00700YN96A00@mtaout29.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 17:55:29 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout29.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0X00N2OYWH0BA0@mtaout29.012.net.il>; Thu, 13 Feb 2014 17:55:29 +0200 (IST) Date: Thu, 13 Feb 2014 17:53:08 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: <52FC81FF.5090202@easy-emacs.de> X-012-Sender: halo1@inter.net.il To: Andreas
=?utf-8?Q?R=C3=B6hler?= Message-id: <837g8zrouz.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-transfer-encoding: 8BIT References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <52FC81FF.5090202@easy-emacs.de> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: lekktu@gmail.com, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Thu, 13 Feb 2014 09:27:43 +0100 > From: Andreas Röhler > CC: 16731@debbugs.gnu.org, Juanma Barranquero > > So it seems upcase_table is populated wrongly with "ß"? I see nothing wrong with it: its entry is the character itself, like any other character that has no up-case variant. 
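
A minimal sketch of why an identity entry is ambiguous, assuming a toy model in which only ASCII letters are paired, rather than Emacs's real char-tables. The `toy_*` names are hypothetical and exist only for this illustration; the two predicates have the same shape as the one-line definitions of `uppercasep` and `lowercasep` quoted from src/buffer.h in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy case tables (an assumption for illustration): only ASCII letters
   are paired; every other character maps to itself.  */
static int
toy_downcase (int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

static int
toy_upcase1 (int c)
{
  return (c >= 'a' && c <= 'z') ? c - ('a' - 'A') : c;
}

/* Same shape as the pair-based predicates quoted in this thread.  */
static bool
toy_uppercasep (int c)
{
  return toy_downcase (c) != c;
}

static bool
toy_lowercasep (int c)
{
  return !toy_uppercasep (c) && toy_upcase1 (c) != c;
}

/* Sharp s (U+00DF) and a comma both map to themselves, so the
   pair-based test cannot tell the letter from the punctuation.  */
static void
toy_self_check (void)
{
  assert (toy_lowercasep ('a'));
  assert (!toy_lowercasep (0xDF));
  assert (!toy_lowercasep (','));
}
```

In this model the table entry for sharp s is not wrong as such; the predicate simply has no way to read "lower case, but unpaired" out of an identity mapping.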
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 11:33:23 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 16:33:23 +0000 Received: from localhost ([127.0.0.1]:50985 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDzE1-0000ej-7b for submit@debbugs.gnu.org; Thu, 13 Feb 2014 11:33:22 -0500 Received: from mtaout20.012.net.il ([80.179.55.166]:50106) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDzDy-0000eO-As for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 11:33:19 -0500 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N0Y00B000FNIQ00@a-mtaout20.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 18:33:11 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y00BKT0NB9X60@a-mtaout20.012.net.il>; Thu, 13 Feb 2014 18:33:11 +0200 (IST) Date: Thu, 13 Feb 2014 18:33:05 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83y51fq8fy.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=iso-8859-1 Content-transfer-encoding: 8BIT References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: Andreas Röhler , > 
16731@debbugs.gnu.org
> Date: Thu, 13 Feb 2014 08:37:45 -0500
>
> > How will we then be able to distinguish between lower-case characters
> > that have no upcase variant and characters that are not lower-case
> > characters at all?
>
> Right: to handle this, we need to distinguish characters that are
> lower-case without an uppercase variant from characters which are
> neither lowercase nor uppercase.
>
> We could do that by saying that the upcase table should return nil or -1
> for ß, to indicate that the upcase version is "missing".  But such
> a change will probably require carefully revising "all" the code that
> uses those tables.

Right.  I can instead suggest a much less intrusive change below.  Its
only disadvantage is that if some user or Lisp program overrides the
standard case tables, and actually _wants_ some lower-case characters
to behave as if they weren't, looking at the Unicode tables will undo
such customizations.  If this is a concern, perhaps we could compare
the case table with the standard value, and only use the Unicode
attributes when they are equal?

If the approach below is accepted, a related question is how to treat
letters whose category is Lt, i.e. "titlecase" -- do we consider such
letters upper case or don't we?

--- src/buffer.h~0  2014-01-01 09:46:07.000000000 +0200
+++ src/buffer.h  2014-02-13 18:27:32.225839000 +0200
@@ -1349,7 +1349,19 @@ downcase (int c)
 }
 
 /* True if C is upper case.  */
-INLINE bool uppercasep (int c) { return downcase (c) != c; }
+INLINE bool uppercasep (int c)
+{
+  Lisp_Object val;
+
+  if (downcase (c) != c)
+    return true;
+
+  if (NILP (Vunicode_category_table))
+    return false;
+
+  val = CHAR_TABLE_REF (Vunicode_category_table, c);
+  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu;
+}
 
 /* Upcase a character C known to be not upper case.  */
 INLINE int
@@ -1364,7 +1376,16 @@ upcase1 (int c)
 INLINE bool
 lowercasep (int c)
 {
-  return !uppercasep (c) && upcase1 (c) != c;
+  Lisp_Object val;
+
+  if (!uppercasep (c) && upcase1 (c) != c)
+    return true;
+
+  if (NILP (Vunicode_category_table))
+    return false;
+
+  val = CHAR_TABLE_REF (Vunicode_category_table, c);
+  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Ll;
 }
 
 /* Upcase a character C, or make no change if that cannot be done.  */

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 12:11:00 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 17:11:00 +0000 Received: from localhost ([127.0.0.1]:51062 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDzoR-0002zP-EV for submit@debbugs.gnu.org; Thu, 13 Feb 2014 12:10:59 -0500 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:44252) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WDzoO-0002z5-7p for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 12:10:56 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh8BBVYjEAs0EhQYDSSIJMEtkQoDiGGcGYFegxU X-IPAS-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh8BBVYjEAs0EhQYDSSIJMEtkQoDiGGcGYFegxU X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="47560352" Received: from 75-119-233-174.dsl.teksavvy.com (HELO pastel.home) ([75.119.233.174]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 13 Feb 2014 12:10:50 -0500 Received: by pastel.home (Postfix, from userid 20848) id EDEEA6007C; Thu, 13 Feb 2014 12:10:49 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org>
<83y51fq8fy.fsf@gnu.org> Date: Thu, 13 Feb 2014 12:10:49 -0500 In-Reply-To: <83y51fq8fy.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 18:33:05 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) > /* True if C is upper case. */ > -INLINE bool uppercasep (int c) { return downcase (c) != c; } > +INLINE bool uppercasep (int c) > +{ > + Lisp_Object val; > + > + if (downcase (c) != c) > + return true; > + > + if (NILP (Vunicode_category_table)) > + return false; > + > + val = CHAR_TABLE_REF (Vunicode_category_table, c); > + return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu; > +} Doesn't sound too bad. But it does beg the question: why check (downcase (c) != c) at all, then? 
Stefan From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 12:39:23 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 17:39:24 +0000 Received: from localhost ([127.0.0.1]:51080 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Fv-00069C-85 for submit@debbugs.gnu.org; Thu, 13 Feb 2014 12:39:23 -0500 Received: from mtaout24.012.net.il ([80.179.55.180]:34455) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Fp-00068o-K0 for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 12:39:18 -0500 Received: from conversion-daemon.mtaout24.012.net.il by mtaout24.012.net.il (HyperSendmail v2007.08) id <0N0Y00F003DYNI00@mtaout24.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 19:38:10 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout24.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y0082G3NMZW80@mtaout24.012.net.il>; Thu, 13 Feb 2014 19:38:10 +0200 (IST) Date: Thu, 13 Feb 2014 19:39:04 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83r476rjyf.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Thu, 13 Feb 2014 12:10:49 -0500 > > > /* 
True if C is upper case.  */
> > -INLINE bool uppercasep (int c) { return downcase (c) != c; }
> > +INLINE bool uppercasep (int c)
> > +{
> > +  Lisp_Object val;
> > +
> > +  if (downcase (c) != c)
> > +    return true;
> > +
> > +  if (NILP (Vunicode_category_table))
> > +    return false;
> > +
> > +  val = CHAR_TABLE_REF (Vunicode_category_table, c);
> > +  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu;
> > +}
>
> Doesn't sound too bad.  But it does beg the question: why check
> (downcase (c) != c) at all, then?

Because it's faster, and for most characters will do the job.

From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 12:58:14 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 17:58:14 +0000 Received: from localhost ([127.0.0.1]:51092 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Y9-0006gD-MU for submit@debbugs.gnu.org; Thu, 13 Feb 2014 12:58:14 -0500 Received: from moutng.kundenserver.de ([212.227.17.10]:50077) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Y6-0006ft-9T for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 12:58:11 -0500 Received: from purzel.sitgens (brln-4dba0180.pool.mediaWays.net [77.186.1.128]) by mrelayeu.kundenserver.de (node=mreue006) with ESMTP (Nemesis) id 0MHQzX-1W18Eu3o27-00E3xZ; Thu, 13 Feb 2014 18:58:00 +0100 Message-ID: <52FD08A0.1070300@easy-emacs.de> Date: Thu, 13 Feb 2014 19:02:08 +0100 From: =?ISO-8859-15?Q?Andreas_R=F6hler?= User-Agent: Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Eli Zaretskii , Stefan Monnier Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> In-Reply-To: <83r476rjyf.fsf@gnu.org> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 8bit X-Provags-ID: V02:K0:7Kbc9nqBeiyoTzmgHCjCDp/3s7fCBccXuj5XejP81ib QDrYjTEI9SI0u04WFVREBjRrYkVceC56FLnYXSEkz4rGM5738z NuMe9NZqdUz43JdxG8Q4gkhGn014zsX+fPA4ASo1+cowsG38S/ DMEnzO6h09tdPtcy26c3GDqQEibVZxwNEuhPxg6kiANln+ECPE c+mDrmchdNuFfGW+0GQAMdtwQ3uw2wfWaE8k3cQnDEgiCKjPJH 3ZBymNcDMI9x+sxq0KKoVvL412aRCfuH+9PvJKZVo8DhJnvOhG +auyQtwPgAeWR4CzRBCeiH3z3Y03btQjHMzl7jqbsoVAl6BSgt P5FiiYj/9l4Uw32Q0/ckkkKXDsFbdLYxsGIqF71fQ X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 16731 Cc: 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

On 13.02.2014 18:39, Eli Zaretskii wrote:
>> From: Stefan Monnier
>> Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org
>> Date: Thu, 13 Feb 2014 12:10:49 -0500
>>
>>> /* True if C is upper case.  */
>>> -INLINE bool uppercasep (int c) { return downcase (c) != c; }
>>> +INLINE bool uppercasep (int c)
>>> +{
>>> +  Lisp_Object val;
>>> +
>>> +  if (downcase (c) != c)
>>> +    return true;
>>> +
>>> +  if (NILP (Vunicode_category_table))
>>> +    return false;
>>> +
>>> +  val = CHAR_TABLE_REF (Vunicode_category_table, c);
>>> +  return INTEGERP (val) && XINT (val) == UNICODE_CATEGORY_Lu;
>>> +}
>>
>> Doesn't sound too bad.  But it does beg the question: why check
>> (downcase (c) != c) at all, then?
>
> Because it's faster, and for most characters will do the job.
>

Maybe I'm missing the point: all change needed is not to store "ß" into the uppercase-table.
Why not store nil there instead?
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 12:58:53 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 17:58:53 +0000 Received: from localhost ([127.0.0.1]:51095 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Ym-0006hL-Ma for submit@debbugs.gnu.org; Thu, 13 Feb 2014 12:58:53 -0500 Received: from mail-yk0-f172.google.com ([209.85.160.172]:58895) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0Yk-0006h2-AP for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 12:58:50 -0500 Received: by mail-yk0-f172.google.com with SMTP id 200so20066359ykr.3 for <16731@debbugs.gnu.org>; Thu, 13 Feb 2014 09:58:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=4/9QBf/qVtRvOsL77zhfH1sv+Ct0Ojn8UAX6JYNoQcg=; b=Y0RsLM2OoSG21PpPR84+r71Cf3LGao6YIk4JDP76NWfo3SSorTuA+Y20o9PgaFhtvY saXByvKkUMzcKrcYsPmRX5amSOfABWlyC1WjHXQNvwbYobkD9Miy93iaYNv240nzmofd etbAZMTBczzgCLPt0b8Nj/99vumrsicSDh4h64jLq04PUBd/DMIB8UElT/S8TOmrDbHN EqY1MRK9s/i9Y8fAvAC1p6AuFc2EDZqrSeK2QHed0h82tvec25e685XXqO6iBTb6MTXG aZZM1Qr9h5GAfH0Fbxxi7b6cS0QPrays72zozKhAy7ZpcdYPxczxftcjjeMgwML5j6ik OyOw== X-Received: by 10.236.169.9 with SMTP id m9mr1651311yhl.137.1392314324292; Thu, 13 Feb 2014 09:58:44 -0800 (PST) MIME-Version: 1.0 Received: by 10.170.84.65 with HTTP; Thu, 13 Feb 2014 09:58:04 -0800 (PST) In-Reply-To: <83y51fq8fy.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> From: Juanma Barranquero Date: Thu, 13 Feb 2014 18:58:04 +0100 Message-ID: Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case To: Eli Zaretskii Content-Type: text/plain; charset=UTF-8 
X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 16731 Cc: Stefan Monnier , 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

On Thu, Feb 13, 2014 at 5:33 PM, Eli Zaretskii wrote:

> If the approach below is accepted, a related question is how to treat
> letters whose category is Lt, i.e. "titlecase" -- do we consider such
> letters upper case or don't we?

No Unicode expert, but this suggests they are uppercase, sort of:

http://www.unicode.org/faq/casemap_charprop.html

"Q: What is titlecase? How is it different from uppercase?

A: Titlecase takes its name from the case format used when forming a
title, in which the initial letter in a word is capitalized and the
rest are not. Titlecase is also used in forming a sentence by
capitalizing the first word, and for forming proper names. The
titlecase mapping in the Unicode Standard is the mapping applied to
the initial character in a word.

The titlecase mapping in Unicode differs from the uppercase mapping in
that a number of characters require special handling. These are
chiefly ligatures and digraphs such as 'fl', 'dz', and 'lj', plus a
number of polytonic Greek characters. For example, U+01C7 (LJ) maps to
U+01C8 (Lj) rather than to U+01C9 (lj)."
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:10:11 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:10:11 +0000 Received: from localhost ([127.0.0.1]:51120 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0jj-00074M-6m for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:10:11 -0500 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:64151) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0jg-00073u-3o for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:10:08 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IPAS-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="47569651" Received: from 75-119-233-174.dsl.teksavvy.com (HELO pastel.home) ([75.119.233.174]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 13 Feb 2014 13:10:02 -0500 Received: by pastel.home (Postfix, from userid 20848) id 6B619600EB; Thu, 13 Feb 2014 13:10:02 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> Date: Thu, 13 Feb 2014 13:10:02 -0500 In-Reply-To: <83r476rjyf.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 19:39:04 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) >> Doesn't sound too bad. But it does beg the question: why check >> (downcase (c) != c) at all, then? > Because it's faster, Is it? Both lookups look like CHAR_TABLE_REF to me. > and for most characters will do the job. But we'll check the unicode table at least for more than half the characters (i.e. for all the lowercase and non-case characters), so the fast path can't give us more than a factor of 2 speed up anyway, and the slow path is made slower by unnecessarily looking up the case table. I guess what I mean is that without actual measurements it's not obvious at all that speed is a good justification. Stefan From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:17:03 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:17:03 +0000 Received: from localhost ([127.0.0.1]:51131 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0qM-0007H6-I5 for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:17:02 -0500 Received: from mtaout20.012.net.il ([80.179.55.166]:35344) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0qJ-0007GV-3V for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:17:00 -0500 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N0Y00C004TKTT00@a-mtaout20.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 20:16:52 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y00C515G20UA0@a-mtaout20.012.net.il>; Thu, 13 Feb 2014 20:16:52 +0200 (IST) Date: Thu, 13 Feb 2014 20:16:45 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan 
Monnier Message-id: <8338jmev3m.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Thu, 13 Feb 2014 13:10:02 -0500 > > >> Doesn't sound too bad. But it does beg the question: why check > >> (downcase (c) != c) at all, then? > > Because it's faster, > > Is it? Both lookups look like CHAR_TABLE_REF to me. > > > and for most characters will do the job. > > But we'll check the unicode table at least for more than half the > characters (i.e. for all the lowercase and non-case characters), so the > fast path can't give us more than a factor of 2 speed up anyway, and the > slow path is made slower by unnecessarily looking up the case table. > > I guess what I mean is that without actual measurements it's not obvious > at all that speed is a good justification. What about custom buffer-local case tables? 
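
To make the trade-off concrete, here is a rough sketch of the two-stage check under discussion, assuming a toy 256-entry case table and a stand-in category function. Every `sketch_*` name is hypothetical, and none of this is Emacs's actual char-table or Unicode category code. It sets up a custom pairing that only the case-table lookup can see, next to sharp s, which only the category fallback can see.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a (possibly buffer-local) case table.  */
struct sketch_case_table
{
  int down[256];
  int up[256];
};

/* Hypothetical stand-ins for Unicode general categories.  */
enum sketch_category { SKETCH_OTHER, SKETCH_LU, SKETCH_LL };

static enum sketch_category
sketch_category_of (int c)
{
  if (c >= 'A' && c <= 'Z')
    return SKETCH_LU;
  if ((c >= 'a' && c <= 'z') || c == 0xDF)
    return SKETCH_LL;
  return SKETCH_OTHER;
}

/* Start from identity entries, then pair the ASCII letters.  */
static void
sketch_init (struct sketch_case_table *t)
{
  for (int c = 0; c < 256; c++)
    t->down[c] = t->up[c] = c;
  for (int c = 'a'; c <= 'z'; c++)
    {
      t->up[c] = c - ('a' - 'A');
      t->down[c - ('a' - 'A')] = c;
    }
}

/* Register UC and LC as a case pair, as a customized table might.  */
static void
sketch_set_pair (struct sketch_case_table *t, int uc, int lc)
{
  t->down[uc] = lc;
  t->up[lc] = uc;
}

static bool
sketch_uppercasep (const struct sketch_case_table *t, int c)
{
  if (c < 0 || c > 255)
    return false;
  if (t->down[c] != c)                          /* stage 1: case table */
    return true;
  return sketch_category_of (c) == SKETCH_LU;   /* stage 2: category */
}

static bool
sketch_lowercasep (const struct sketch_case_table *t, int c)
{
  if (c < 0 || c > 255)
    return false;
  if (!sketch_uppercasep (t, c) && t->up[c] != c)
    return true;
  return sketch_category_of (c) == SKETCH_LL;
}

static void
sketch_self_check (void)
{
  struct sketch_case_table t;
  sketch_init (&t);
  sketch_set_pair (&t, '[', '{');   /* custom pairing unknown to Unicode */

  assert (sketch_uppercasep (&t, '['));    /* found by the case table */
  assert (sketch_lowercasep (&t, '{'));    /* found by the case table */
  assert (sketch_lowercasep (&t, 0xDF));   /* found by the category */
  assert (!sketch_lowercasep (&t, ','));
}
```

The same sketch also reproduces the customization concern raised earlier in the thread: if a table deliberately leaves `a` unpaired, the category fallback still reports it as lower case.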
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:18:13 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:18:13 +0000 Received: from localhost ([127.0.0.1]:51136 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0rV-0007Jh-3P for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:18:13 -0500 Received: from mtaout20.012.net.il ([80.179.55.166]:35569) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0rT-0007JO-3K for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:18:11 -0500 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N0Y00C004TKTT00@a-mtaout20.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 20:18:05 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y00CC55I50UA0@a-mtaout20.012.net.il>; Thu, 13 Feb 2014 20:18:05 +0200 (IST) Date: Thu, 13 Feb 2014 20:17:59 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: <52FD08A0.1070300@easy-emacs.de> X-012-Sender: halo1@inter.net.il To: Andreas =?iso-8859-15?Q?R=F6hler?= Message-id: <831tz6ev1k.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=iso-8859-15 Content-transfer-encoding: 8BIT References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <52FD08A0.1070300@easy-emacs.de> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: monnier@iro.umontreal.ca, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > Date: Thu, 13 Feb 2014 19:02:08 +0100 > From: Andreas Röhler > CC: 16731@debbugs.gnu.org > > Maybe I'm missing the point: all change needed is not to store "ß" into the uppercase-table. > Why not store nil there instead? Because that's not what case tables are documented to hold. We will break back compatibility if we put nil there. From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:19:13 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:19:14 +0000 Received: from localhost ([127.0.0.1]:51140 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0sT-0007La-EX for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:19:13 -0500 Received: from mtaout28.012.net.il ([80.179.55.184]:43662) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0sR-0007LL-K7 for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:19:12 -0500 Received: from conversion-daemon.mtaout28.012.net.il by mtaout28.012.net.il (HyperSendmail v2007.08) id <0N0Y00L0057SGO00@mtaout28.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 20:20:01 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout28.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y00IUT5LCLV50@mtaout28.012.net.il>; Thu, 13 Feb 2014 20:20:01 +0200 (IST) Date: Thu, 13 Feb 2014 20:18:59 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Juanma Barranquero Message-id: <83zjludgfg.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: monnier@iro.umontreal.ca, 16731@debbugs.gnu.org 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+)

> From: Juanma Barranquero
> Date: Thu, 13 Feb 2014 18:58:04 +0100
> Cc: Stefan Monnier , 16731@debbugs.gnu.org
>
> On Thu, Feb 13, 2014 at 5:33 PM, Eli Zaretskii wrote:
>
> > If the approach below is accepted, a related question is how to treat
> > letters whose category is Lt, i.e. "titlecase" -- do we consider such
> > letters upper case or don't we?
>
> No Unicode expert, but this suggests they are uppercase, sort of:
>
> http://www.unicode.org/faq/casemap_charprop.html
>
> "Q: What is titlecase? How is it different from uppercase?
>
> A: Titlecase takes its name from the case format used when forming a
> title, in which the initial letter in a word is capitalized and the
> rest are not. Titlecase is also used in forming a sentence by
> capitalizing the first word, and for forming proper names. The
> titlecase mapping in the Unicode Standard is the mapping applied to
> the initial character in a word.
>
> The titlecase mapping in Unicode differs from the uppercase mapping in
> that a number of characters require special handling. These are
> chiefly ligatures and digraphs such as 'fl', 'dz', and 'lj', plus a
> number of polytonic Greek characters. For example, U+01C7 (LJ) maps to
> U+01C8 (Lj) rather than to U+01C9 (lj)."

The question is whether we want [:upper:] to match titlecase letters.
From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:23:37 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:23:37 +0000 Received: from localhost ([127.0.0.1]:51148 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0wj-0007Ta-DM for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:23:37 -0500 Received: from mail-yk0-f174.google.com ([209.85.160.174]:63538) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE0wh-0007TL-Ms for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:23:36 -0500 Received: by mail-yk0-f174.google.com with SMTP id 20so20269958yks.5 for <16731@debbugs.gnu.org>; Thu, 13 Feb 2014 10:23:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=aKKpeMVe/MFJoy5YrmD6SPr6TygMiCqk+go4Z258C9Q=; b=SHOuu1ehZWj+9SbsdgT5UsFCv75yTEB+3P3mDpuba9MlA1bdgrJHpojB9yRdyE8K0M vG8sf9vvtui1mmalJA6KmEebNEvIg8ZVO3dZRDvjc4blyQj9O/YOBt8TPVf0eOBgzBP+ Kdu6yvMdm9bt4Nwu4WzZWTtejVguCpyNQfxIZmXu0fDJ2GCOwC002SV+4xPihGJAqeN1 eQM6MJoy/lk+kY72j8vc1sZiQnU9iSuHSI6z8UoBC5TlN0PJyNW9aXVvgVkHdb4JaleK dw/5vofpPN+nEdtwwn3EGkSXzlHuWiy5hskd6yCvQwrWpDkiEHkKrSOCdk93FalEA6vs tazA== X-Received: by 10.236.132.74 with SMTP id n50mr2659575yhi.20.1392315810079; Thu, 13 Feb 2014 10:23:30 -0800 (PST) MIME-Version: 1.0 Received: by 10.170.84.65 with HTTP; Thu, 13 Feb 2014 10:22:49 -0800 (PST) In-Reply-To: <83zjludgfg.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83zjludgfg.fsf@gnu.org> From: Juanma Barranquero Date: Thu, 13 Feb 2014 19:22:49 +0100 Message-ID: Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case To: Eli Zaretskii Content-Type: 
text/plain; charset=UTF-8 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 16731 Cc: Stefan Monnier , 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Thu, Feb 13, 2014 at 7:18 PM, Eli Zaretskii wrote: > The question is whether we want [:upper:] to match titlecase letters. Yes, I understand. And I'm pointing out that, unless there's a separate [:title:] matcher, matching them with [:upper:] is not entirely unreasonable. Whether it is the right thing to do or not will depend on the uses, I think. J From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 13:47:37 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 18:47:37 +0000 Received: from localhost ([127.0.0.1]:51159 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE1Jx-0008A7-FY for submit@debbugs.gnu.org; Thu, 13 Feb 2014 13:47:37 -0500 Received: from fencepost.gnu.org ([208.118.235.10]:44946 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE1Jv-00089x-4r for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 13:47:35 -0500 Received: from rgm by fencepost.gnu.org with local (Exim 4.71) (envelope-from ) id 1WE1Ju-0006n2-1d; Thu, 13 Feb 2014 13:47:34 -0500 From: Glenn Morris To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83zjludgfg.fsf@gnu.org> X-Spook: Agfa Mantis virus national information infrastructure NWO X-Ran: 
JdO;="B{BW&'hX(D4@P,D0|(CT-l\kr69k2W&=rC$]R'ps#k`%OICW,Kr@SU:)TWL*8P;p X-Hue: white X-Debbugs-No-Ack: yes X-Attribution: GM Date: Thu, 13 Feb 2014 13:47:33 -0500 In-Reply-To: <83zjludgfg.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 20:18:59 +0200") Message-ID: User-Agent: Gnus (www.gnus.org), GNU Emacs (www.gnu.org/software/emacs/) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -4.9 (----) X-Debbugs-Envelope-To: 16731 Cc: Juanma Barranquero , 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.9 (----) Eli Zaretskii wrote: > The question is whether we want [:upper:] to match titlecase letters. What does grep do? (http://debbugs.gnu.org/16631 ?) From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 14:15:45 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 19:15:45 +0000 Received: from localhost ([127.0.0.1]:51169 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE1lA-0000W4-JO for submit@debbugs.gnu.org; Thu, 13 Feb 2014 14:15:45 -0500 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:25286) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE1l9-0000Vp-A0 for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 14:15:43 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IPAS-Result: Av8EABK/CFFLd+mu/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="47576438" Received: from 75-119-233-174.dsl.teksavvy.com (HELO pastel.home) ([75.119.233.174]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 13 Feb 2014 14:15:37 -0500 Received: by pastel.home (Postfix, 
from userid 20848) id AEF29600EB; Thu, 13 Feb 2014 14:15:37 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> Date: Thu, 13 Feb 2014 14:15:37 -0500 In-Reply-To: <8338jmev3m.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 20:16:45 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) > What about custom buffer-local case tables? That's what I meant by my question, yes. Your change will break about half of the uses of buffer-local case tables. Using the unicode table all the time will break them all. Is it a real issue? I really don't know. 
Stefan From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 15:16:16 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 20:16:16 +0000 Received: from localhost ([127.0.0.1]:51218 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE2hk-0002Gg-7l for submit@debbugs.gnu.org; Thu, 13 Feb 2014 15:16:16 -0500 Received: from mtaout26.012.net.il ([80.179.55.182]:60734) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE2hh-0002GO-1P for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 15:16:14 -0500 Received: from conversion-daemon.mtaout26.012.net.il by mtaout26.012.net.il (HyperSendmail v2007.08) id <0N0Y00D00AS2CN00@mtaout26.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 22:14:39 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout26.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y006KSAWFWF60@mtaout26.012.net.il>; Thu, 13 Feb 2014 22:14:39 +0200 (IST) Date: Thu, 13 Feb 2014 22:16:00 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Glenn Morris Message-id: <83wqgydb0f.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83zjludgfg.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: lekktu@gmail.com, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Glenn Morris > Cc: Juanma Barranquero , 16731@debbugs.gnu.org > Date: Thu, 13 Feb 2014 13:47:33 -0500 > > Eli 
Zaretskii wrote: > > > The question is whether we want [:upper:] to match titlecase letters. > > What does grep do? > (http://debbugs.gnu.org/16631 ?) Grep (like most other programs) uses locale-dependent tables provided by libc, so what it does is not really relevant for us. From debbugs-submit-bounces@debbugs.gnu.org Thu Feb 13 15:25:12 2014 Received: (at 16731) by debbugs.gnu.org; 13 Feb 2014 20:25:12 +0000 Received: from localhost ([127.0.0.1]:51230 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE2qL-0002XC-OU for submit@debbugs.gnu.org; Thu, 13 Feb 2014 15:25:10 -0500 Received: from mtaout23.012.net.il ([80.179.55.175]:35387) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WE2qG-0002Wc-Qg for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 15:25:06 -0500 Received: from conversion-daemon.a-mtaout23.012.net.il by a-mtaout23.012.net.il (HyperSendmail v2007.08) id <0N0Y00L00B96PP00@a-mtaout23.012.net.il> for 16731@debbugs.gnu.org; Thu, 13 Feb 2014 22:24:58 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout23.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N0Y00KPFBDLSIA0@a-mtaout23.012.net.il>; Thu, 13 Feb 2014 22:24:58 +0200 (IST) Date: Thu, 13 Feb 2014 22:24:52 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83vbwidaln.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list 
Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Thu, 13 Feb 2014 14:15:37 -0500 > > > What about custom buffer-local case tables? > > That's what I meant by my question, yes. Your change will break about half of > the uses of buffer-local case tables. Using the unicode table all the > time will break them all. > Is it a real issue? I really don't know. Neither do I. How about if we use the unicode tables only if the corresponding buffer's case-table is the standard one (Vascii_*_table)? From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 14 11:20:58 2014 Received: (at 16731) by debbugs.gnu.org; 14 Feb 2014 16:20:58 +0000 Received: from localhost ([127.0.0.1]:52535 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WELVZ-0008DD-Tk for submit@debbugs.gnu.org; Fri, 14 Feb 2014 11:20:58 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]:33247) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WELVX-0008Cx-4h for 16731@debbugs.gnu.org; Fri, 14 Feb 2014 11:20:56 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 91F9AA60005 for <16731@debbugs.gnu.org>; Fri, 14 Feb 2014 08:20:49 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tD+uvIsAadAI for <16731@debbugs.gnu.org>; Fri, 14 Feb 2014 08:20:49 -0800 (PST) Received: from [192.168.1.9] (pool-108-0-233-62.lsanca.fios.verizon.net [108.0.233.62]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 1232A39E8015 for <16731@debbugs.gnu.org>; Fri, 14 Feb 2014 08:20:49 -0800 (PST) Message-ID: <52FE4260.1000509@cs.ucla.edu> Date: Fri, 14 Feb 2014 
08:20:48 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: 16731@debbugs.gnu.org Subject: Re: bug#16731: 24.3.50; , Latin small letter sharp s is not considered lower-case Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 16731 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) Grep doesn't just use glibc's tables; it has its own dfa matcher (also shared by awk), and runs into problems in this area as well. I'm working on fixes for this in my limited spare time. If you want 'uppercasep' to match what glibc and grep mean by [[:upper:]], Emacs might need to check not merely for UNICODE_CATEGORY_Lu but also for other Unicode categories (mixed case, title case). I haven't investigated the details. 
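To get a feel for how many characters the Lu-versus-Lt decision actually touches, the titlecase letters can simply be enumerated. This is a rough Python sketch, assuming classification purely by Unicode general category, which is not necessarily how glibc or grep decide. `matches_upper` and its `include_titlecase` flag are hypothetical names for this note, not Emacs or libc APIs.

```python
import sys
import unicodedata


def matches_upper(ch, include_titlecase=False):
    """Hypothetical [:upper:]-style predicate driven only by general category."""
    accepted = {"Lu", "Lt"} if include_titlecase else {"Lu"}
    return unicodedata.category(ch) in accepted


# Every code point whose answer depends on the policy choice.
titlecase_letters = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Lt"
]

print(len(titlecase_letters), "titlecase (Lt) code points in this Unicode version")
for ch in titlecase_letters[:4]:
    # Show the answer under both policies for the first few Lt characters.
    print(f"U+{ord(ch):04X}", matches_upper(ch), matches_upper(ch, include_titlecase=True))
```

Under this model the two policies differ only on the listed Lt characters; every other character gets the same answer either way.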
From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 14 12:22:59 2014 Received: (at 16731) by debbugs.gnu.org; 14 Feb 2014 17:22:59 +0000 Received: from localhost ([127.0.0.1]:52574 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEMTb-0002mZ-DG for submit@debbugs.gnu.org; Fri, 14 Feb 2014 12:22:59 -0500 Received: from mercure.iro.umontreal.ca ([132.204.24.67]:35142) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEMTa-0002mR-8l for 16731@debbugs.gnu.org; Fri, 14 Feb 2014 12:22:58 -0500 Received: from hidalgo.iro.umontreal.ca (hidalgo.iro.umontreal.ca [132.204.27.50]) by mercure.iro.umontreal.ca (Postfix) with ESMTP id 0E36C84CEC; Fri, 14 Feb 2014 12:22:58 -0500 (EST) Received: from lechon.iro.umontreal.ca (lechon.iro.umontreal.ca [132.204.27.242]) by hidalgo.iro.umontreal.ca (Postfix) with ESMTP id 4B2ED1E5913; Fri, 14 Feb 2014 12:22:35 -0500 (EST) Received: by lechon.iro.umontreal.ca (Postfix, from userid 20848) id 39031B40FE; Fri, 14 Feb 2014 12:22:35 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> Date: Fri, 14 Feb 2014 12:22:35 -0500 In-Reply-To: <83vbwidaln.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 13 Feb 2014 22:24:52 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-DIRO-MailScanner-Information: Please contact the ISP for more information X-DIRO-MailScanner: Found to be clean X-DIRO-MailScanner-SpamCheck: n'est pas un polluriel, SpamAssassin (score=-2.82, requis 5, autolearn=not spam, 
ALL_TRUSTED -2.82, MC_TSTLAST 0.00) X-DIRO-MailScanner-From: monnier@iro.umontreal.ca X-Spam-Status: No X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) >> Is it a real issue? I really don't know. > Neither do I. Maybe it's not a problem. Someone(TM) should grep to try and figure it out, and then try it out. > How about if we use the unicode tables only if the corresponding > buffer's case-table is the standard one (Vascii_*_table)? That sounds kludgy. Stefan From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 14 13:16:22 2014 Received: (at 16731) by debbugs.gnu.org; 14 Feb 2014 18:16:23 +0000 Received: from localhost ([127.0.0.1]:52601 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WENJG-0005Zk-00 for submit@debbugs.gnu.org; Fri, 14 Feb 2014 13:16:22 -0500 Received: from mtaout20.012.net.il ([80.179.55.166]:54720) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WENJC-0005ZN-4h for 16731@debbugs.gnu.org; Fri, 14 Feb 2014 13:16:19 -0500 Received: from conversion-daemon.a-mtaout20.012.net.il by a-mtaout20.012.net.il (HyperSendmail v2007.08) id <0N0Z00500ZC9CO00@a-mtaout20.012.net.il> for 16731@debbugs.gnu.org; Fri, 14 Feb 2014 20:16:11 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout20.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N10004RP02YUV90@a-mtaout20.012.net.il>; Fri, 14 Feb 2014 20:16:11 +0200 (IST) Date: Fri, 14 Feb 2014 20:16:08 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <83zjltblw7.fsf@gnu.org> 
References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Fri, 14 Feb 2014 12:22:35 -0500 > > > How about if we use the unicode tables only if the corresponding > > buffer's case-table is the standard one (Vascii_*_table)? > > That sounds kludgy. Why kludgy? If the tables were not customized, it is a sign that this buffer is OK with the default properties, which is what the Unicode properties are about. 
From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 14 15:59:28 2014 Received: (at 16731) by debbugs.gnu.org; 14 Feb 2014 20:59:28 +0000 Received: from localhost ([127.0.0.1]:52699 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEPr5-00029c-5M for submit@debbugs.gnu.org; Fri, 14 Feb 2014 15:59:27 -0500 Received: from mercure.iro.umontreal.ca ([132.204.24.67]:33852) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEPr3-00029U-L5 for 16731@debbugs.gnu.org; Fri, 14 Feb 2014 15:59:26 -0500 Received: from hidalgo.iro.umontreal.ca (hidalgo.iro.umontreal.ca [132.204.27.50]) by mercure.iro.umontreal.ca (Postfix) with ESMTP id EB204848BD; Fri, 14 Feb 2014 15:59:24 -0500 (EST) Received: from lechon.iro.umontreal.ca (lechon.iro.umontreal.ca [132.204.27.242]) by hidalgo.iro.umontreal.ca (Postfix) with ESMTP id 990331E5B74; Fri, 14 Feb 2014 15:59:00 -0500 (EST) Received: by lechon.iro.umontreal.ca (Postfix, from userid 20848) id 77561B40FE; Fri, 14 Feb 2014 15:59:00 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> <83zjltblw7.fsf@gnu.org> Date: Fri, 14 Feb 2014 15:59:00 -0500 In-Reply-To: <83zjltblw7.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 14 Feb 2014 20:16:08 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-DIRO-MailScanner-Information: Please contact the ISP for more information X-DIRO-MailScanner: Found to be clean X-DIRO-MailScanner-SpamCheck: n'est pas un polluriel, SpamAssassin (score=-2.82, requis 
5, autolearn=not spam, ALL_TRUSTED -2.82, MC_TSTLAST 0.00) X-DIRO-MailScanner-From: monnier@iro.umontreal.ca X-Spam-Status: No X-Spam-Score: -3.0 (---) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.0 (---) >> > How about if we use the unicode tables only if the corresponding >> > buffer's case-table is the standard one (Vascii_*_table)? >> That sounds kludgy. > Why kludgy? Because, if someone were to take the Vascii_*_table, make a little change to them and use them in a buffer, he suddenly gets different behavior for some chars he hasn't touched. Stefan From debbugs-submit-bounces@debbugs.gnu.org Sat Feb 15 02:12:50 2014 Received: (at 16731) by debbugs.gnu.org; 15 Feb 2014 07:12:50 +0000 Received: from localhost ([127.0.0.1]:52881 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEZQg-0004d3-8h for submit@debbugs.gnu.org; Sat, 15 Feb 2014 02:12:50 -0500 Received: from mtaout24.012.net.il ([80.179.55.180]:50778) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WEZQd-0004cd-CX for 16731@debbugs.gnu.org; Sat, 15 Feb 2014 02:12:48 -0500 Received: from conversion-daemon.mtaout24.012.net.il by mtaout24.012.net.il (HyperSendmail v2007.08) id <0N1000M00ZUZ9D00@mtaout24.012.net.il> for 16731@debbugs.gnu.org; Sat, 15 Feb 2014 09:11:36 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout24.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N1000KTQZZCN820@mtaout24.012.net.il>; Sat, 15 Feb 2014 09:11:36 +0200 (IST) Date: Sat, 15 Feb 2014 09:12:39 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: 
Stefan Monnier Message-id: <83ob28c0ig.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> <83zjltblw7.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Fri, 14 Feb 2014 15:59:00 -0500 > > >> > How about if we use the unicode tables only if the corresponding > >> > buffer's case-table is the standard one (Vascii_*_table)? > >> That sounds kludgy. > > Why kludgy? > > Because, if someone were to take the Vascii_*_table How could they? these variables are not exposed to Lisp. Only ascii-case-table is, which is not the one I had in mind. 
From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 16 22:09:41 2014 Received: (at 16731) by debbugs.gnu.org; 17 Feb 2014 03:09:41 +0000 Received: from localhost ([127.0.0.1]:55418 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFEaT-0001JQ-6u for submit@debbugs.gnu.org; Sun, 16 Feb 2014 22:09:41 -0500 Received: from ironport2-out.teksavvy.com ([206.248.154.181]:44684) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFEaQ-0001J6-Jn for 16731@debbugs.gnu.org; Sun, 16 Feb 2014 22:09:39 -0500 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Av8EABK/CFFMCo7M/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IPAS-Result: Av8EABK/CFFMCo7M/2dsb2JhbABEuzWDWRdzgh4BAQQBViMFCws0EhQYDSSIHgbBLZEKA4hhnBmBXoMV X-IronPort-AV: E=Sophos;i="4.84,565,1355115600"; d="scan'208";a="47844455" Received: from 76-10-142-204.dsl.teksavvy.com (HELO pastel.home) ([76.10.142.204]) by ironport2-out.teksavvy.com with ESMTP/TLS/ADH-AES256-SHA; 16 Feb 2014 22:09:32 -0500 Received: by pastel.home (Postfix, from userid 20848) id C1940600A2; Sun, 16 Feb 2014 22:09:32 -0500 (EST) From: Stefan Monnier To: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case Message-ID: References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> <83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> <83zjltblw7.fsf@gnu.org> <83ob28c0ig.fsf@gnu.org> Date: Sun, 16 Feb 2014 22:09:32 -0500 In-Reply-To: <83ob28c0ig.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 15 Feb 2014 09:12:39 +0200") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 16731 Cc: 
andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) > How could they? these variables are not exposed to Lisp. Only > ascii-case-table is, which is not the one I had in mind. Right, I was thinking of standard-case-table. Still, same problem: take that standard case table change it a bit, and suddenly other chars than the ones you changed are affected. Stefan From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 17 00:29:36 2014 Received: (at 16731) by debbugs.gnu.org; 17 Feb 2014 05:29:36 +0000 Received: from localhost ([127.0.0.1]:55570 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFGlr-00058g-VC for submit@debbugs.gnu.org; Mon, 17 Feb 2014 00:29:36 -0500 Received: from mtaout24.012.net.il ([80.179.55.180]:36621) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1WFGlp-00058O-CH for 16731@debbugs.gnu.org; Mon, 17 Feb 2014 00:29:34 -0500 Received: from conversion-daemon.mtaout24.012.net.il by mtaout24.012.net.il (HyperSendmail v2007.08) id <0N1400700K4SOQ00@mtaout24.012.net.il> for 16731@debbugs.gnu.org; Mon, 17 Feb 2014 07:28:18 +0200 (IST) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout24.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0N14006EYKJ6B330@mtaout24.012.net.il>; Mon, 17 Feb 2014 07:28:18 +0200 (IST) Date: Mon, 17 Feb 2014 07:29:31 +0200 From: Eli Zaretskii Subject: Re: bug#16731: 24.3.50; Latin small letter sharp s is not considered lower-case In-reply-to: X-012-Sender: halo1@inter.net.il To: Stefan Monnier Message-id: <838uta9uis.fsf@gnu.org> References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> <57txc4cj1x.fsf@fencepost.gnu.org> <52FBCC08.50509@easy-emacs.de> <83fvnoru0q.fsf@gnu.org> <52FBD551.2000808@easy-emacs.de> 
<83bnycrsrb.fsf@gnu.org> <52FBDA9B.102@easy-emacs.de> <83a9dvsmi9.fsf@gnu.org> <83y51fq8fy.fsf@gnu.org> <83r476rjyf.fsf@gnu.org> <8338jmev3m.fsf@gnu.org> <83vbwidaln.fsf@gnu.org> <83zjltblw7.fsf@gnu.org> <83ob28c0ig.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 16731 Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Eli Zaretskii List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 1.0 (+) > From: Stefan Monnier > Cc: andreas.roehler@easy-emacs.de, 16731@debbugs.gnu.org > Date: Sun, 16 Feb 2014 22:09:32 -0500 > > > How could they? these variables are not exposed to Lisp. Only > > ascii-case-table is, which is not the one I had in mind. > > Right, I was thinking of standard-case-table. Still, same problem: take > that standard case table change it a bit, and suddenly other chars than > the ones you changed are affected. But customizing case-tables is already a very special use case. Why can't we expect such users to deal with these issues? The only alternative (besides leaving the original problem unsolved) is to ignore buffer-local case tables. Is this more acceptable? 
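The disagreement above can be made concrete with a toy model. The Python sketch below is not how Emacs implements case tables. It assumes a case table is a plain lower-to-upper mapping, that a character counts as lower-case under a custom table only if the table upcases it to something else, and that "the standard table" is recognized by object identity. `STANDARD_TABLE`, `custom_table` and `is_lower` are all names invented for this note.

```python
import unicodedata

# Hypothetical stand-in for the standard case table: lower -> upper pairs.
# Like the table discussed in this bug, it has no entry for "ß".
STANDARD_TABLE = {"a": "A", "s": "S", "x": "X"}


def is_lower(ch, table):
    """Toy lower-case test under the 'Unicode only for the standard table' rule."""
    if table is STANDARD_TABLE:
        # Buffer uses the untouched standard table: trust Unicode categories.
        return unicodedata.category(ch) == "Ll"
    # Customized table: a char is lower-case only if the table upcases it.
    return table.get(ch, ch) != ch


# A user copies the standard table and changes one unrelated entry.
custom_table = dict(STANDARD_TABLE)
custom_table["x"] = "Y"

print(is_lower("ß", STANDARD_TABLE))  # Unicode category decides
print(is_lower("ß", custom_table))    # table lookup decides
```

In this model "ß" is lower-case with the standard table but stops being lower-case as soon as the table is copied and an unrelated entry is changed, which is the side effect Stefan objects to. Whether that is acceptable for the rare users of custom case tables is the open question in the message above.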
From debbugs-submit-bounces@debbugs.gnu.org Fri Jul 16 08:32:53 2021 Received: (at 16731) by debbugs.gnu.org; 16 Jul 2021 12:32:53 +0000 Received: from localhost ([127.0.0.1]:50783 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m4N1R-0003in-DL for submit@debbugs.gnu.org; Fri, 16 Jul 2021 08:32:53 -0400 Received: from quimby.gnus.org ([95.216.78.240]:40628) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m4N1O-0003iS-9i; Fri, 16 Jul 2021 08:32:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=AGauHm2u+4qxWbrQ6rGgO7N/TtEbdJXnhTktumF+u/Q=; b=I8G40o6hC3CHVQir/95oF4Tkk4 A8S1QDSq5+Oyrika7N/LYGsUqvo4/qt9KDUBiSmuBkUdDTweHoR9i46kDCT4eMjJqBiUSqAowwn0q G7mFR57yRjWM9HyagwE2+9buYYe+TC14jUriBk5wrBAI6wrjrWk8D40Hcl4FE+ExsC5M=; Received: from cm-84.212.220.105.getinternet.no ([84.212.220.105] helo=elva) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1m4N1G-0004JT-0T; Fri, 16 Jul 2021 14:32:44 +0200 From: Lars Ingebrigtsen To: Jorgen Schaefer Subject: Re: bug#10576: Subject: 23.4; char class [:lower:] misses latin small letter sharp s References: <87wqh08cjw.fsf@loki.jorgenschaefer.de> X-Now-Playing: Late Night Approach's _Fabric 94: Steffi_: "Poison Valley" Date: Fri, 16 Jul 2021 14:32:41 +0200 In-Reply-To: <87wqh08cjw.fsf@loki.jorgenschaefer.de> (Jorgen Schaefer's message of "Wed, 12 Feb 2014 18:29:23 +0100") Message-ID: <87wnpq8mmu.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 
quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Jorgen Schaefer writes: > The following seems like a bug: > > (string-match "[[:lower:]]" "ß") => nil This has been fixed in Emacs 28. Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 16731 Cc: 10576@debbugs.gnu.org, 16731@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Jorgen Schaefer writes: > The following seems like a bug: > > (string-match "[[:lower:]]" "ß") => nil This has been fixed in Emacs 28. -- (domestic pets only, the antidote for overdose, milk.) 
bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Fri Jul 16 08:32:58 2021 Received: (at control) by debbugs.gnu.org; 16 Jul 2021 12:32:58 +0000 Received: from localhost ([127.0.0.1]:50786 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m4N1W-0003jC-Mn for submit@debbugs.gnu.org; Fri, 16 Jul 2021 08:32:58 -0400 Received: from quimby.gnus.org ([95.216.78.240]:40642) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1m4N1V-0003ij-HB for control@debbugs.gnu.org; Fri, 16 Jul 2021 08:32:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Subject:From:To:Message-Id:Date:Sender:Reply-To:Cc: MIME-Version:Content-Type:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=bKeXEU7LFZ22CM1yVRNPxfOPdc8D0XnGB9lKYkgj1/Y=; b=feMC13R0l8TDokaVkBiE+LjXCG UJDw8eheXm9FZDSKLPSkEWLhh97hGkHdCGZnTW2+uwlRK2gOoqe56oe7FT6RPhwUolf7/b+/wFwuC RUzDXq+r3ss7eUuIGTBv/ytSo6IXj4V6wbFKZrDQKQq856yK81ZWWzK+oCa1gsD3YBc0=; Received: from cm-84.212.220.105.getinternet.no ([84.212.220.105] helo=elva) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1m4N1N-0004Jd-Vz for control@debbugs.gnu.org; Fri, 16 Jul 2021 14:32:52 +0200 Date: Fri, 16 Jul 2021 14:32:49 +0200 Message-Id: <87v95a8mmm.fsf@gnus.org> To: control@debbugs.gnu.org From: Lars Ingebrigtsen Subject: control message for bug #10576 X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. 
Content preview: close 10576 28.1 quit Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) close 10576 28.1 quit From unknown Sat Jun 21 10:31:14 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 14 Aug 2021 11:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator