From unknown Tue Aug 19 10:01:20 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#11309 <11309@debbugs.gnu.org> To: bug#11309 <11309@debbugs.gnu.org> Subject: Status: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek Reply-To: bug#11309 <11309@debbugs.gnu.org> Date: Tue, 19 Aug 2025 17:01:20 +0000 retitle 11309 24.1.50; Case problems with [:upper:] and Cyrillic, Greek reassign 11309 emacs submitter 11309 Aidan Kehoe severity 11309 normal tag 11309 patch thanks From debbugs-submit-bounces@debbugs.gnu.org Sun Apr 22 06:12:32 2012 Received: (at submit) by debbugs.gnu.org; 22 Apr 2012 10:12:32 +0000 Received: from localhost ([127.0.0.1]:46489 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1SLtmR-0003u3-3Z for submit@debbugs.gnu.org; Sun, 22 Apr 2012 06:12:31 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36643) by debbugs.gnu.org with esmtp (Exim 4.72) (envelope-from ) id 1SLtmN-0003tn-43 for submit@debbugs.gnu.org; Sun, 22 Apr 2012 06:12:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SLtlg-00058w-OQ for submit@debbugs.gnu.org; Sun, 22 Apr 2012 06:11:46 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_HI, TO_NO_BRKTS_PCNT autolearn=unavailable version=3.3.2 Received: from lists.gnu.org ([208.118.235.17]:50429) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SLtlg-00058s-LB for submit@debbugs.gnu.org; Sun, 22 Apr 2012 06:11:44 -0400 Received: from eggs.gnu.org ([208.118.235.92]:45577) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SLtle-00023Z-Gh for bug-gnu-emacs@gnu.org; Sun, 22 Apr 2012 06:11:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) 
id 1SLtlb-00058Q-6C for bug-gnu-emacs@gnu.org; Sun, 22 Apr 2012 06:11:41 -0400 Received: from zeus.asclepian.ie ([78.47.46.212]:51788) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SLtla-00057s-QD for bug-gnu-emacs@gnu.org; Sun, 22 Apr 2012 06:11:39 -0400 Received: by zeus.asclepian.ie (Postfix, from userid 1002) id 4F5F782D52A; Sun, 22 Apr 2012 11:11:35 +0100 (IST) From: Aidan Kehoe MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Message-ID: <4f93d952.287f22e03@parhasard.net> Date: Sun, 22 Apr 2012 11:11:30 +0100 To: bug-gnu-emacs@gnu.org Subject: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek X-Mailer: VM 8.0.12-devo-585 under 21.5 (beta31) "ginger" 5d3bb1100832 XEmacs Lucid (i386-apple-darwin10.8.0) X-NS5-file-as-sent: t Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3) X-Received-From: 208.118.235.17 X-Spam-Score: -6.1 (------) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -6.1 (------) This bug report will be sent to the Bug-GNU-Emacs mailing list and the GNU bug tracker at debbugs.gnu.org. Please check that the From: line contains a valid email address. After a delay of up to one day, you should receive an acknowledgement at that address. Please write in English if possible, as the Emacs maintainers usually do not have translators for other languages. Please describe exactly what actions triggered the bug, and the precise symptoms of the bug. 
If you can, give a recipe starting from `emacs -Q':

The Lisp manual says this when describing character classes:

`[:lower:]'
     This matches any lower-case letter, as determined by the current
     case table (*note Case Tables::).  If `case-fold-search' is
     non-`nil', this also matches any upper-case letter.

And:

`[:upper:]'
     This matches any upper-case letter, as determined by the current
     case table (*note Case Tables::).  If `case-fold-search' is
     non-`nil', this also matches any lower-case letter.

OK, so let's test this:

(let ((case-fold-search t))
  (string-match "[[:upper:]]" "a\u0686"))
=> 0 ;; As documented

(upcase "\u0430") ;; CYRILLIC SMALL LETTER A
=> "А" ;; "\u0410", so it's in the case table

(let ((case-fold-search t))
  (string-match "[[:upper:]]" "\u0430\u0686"))
=> nil ;; Ah, this is unexpected.

(let ((case-fold-search t))
  (string-match "[[:lower:]]" "\u0410\u0686"))
=> 0 ;; But this works as documented.

(upcase "\u03b2") ;; GREEK SMALL LETTER BETA
=> "Β" ;; "\u0392", it's in the case table

(let ((case-fold-search t))
  (string-match "[[:upper:]]" "\u03b2\u5357"))
=> nil ;; Oops

(let ((case-fold-search t))
  (string-match "[[:lower:]]" "\u0392\u5357"))
=> 0 ;; But this works, again.

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
For information about debugging Emacs, please read the file
/Sources/emacs/nextstep/Emacs.app/Contents/Resources/etc/DEBUG.
In GNU Emacs 24.1.50.1 (i386-apple-darwin10.8.0, NS apple-appkit-1038.36)
 of 2012-04-22 on bonbon
Windowing system distributor `Apple', version 10.3.1038
Configured using:
 `configure '--with-ns''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: de_DE.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default enable-multibyte-characters: t

Major mode: Info

Minor modes in effect:
  tooltip-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-b C-b C-b C-b C-b C-b C-b C-f SPC \ x 7 f C-e C-j
C-p C-f C-f C-f C-x = C-a ( SPC C-f C-x = C-a C-f s
t m u l t i b y
t e - s t r i n g - p C-a C-f C-f C-f C-f t C-e ) C-j
C-p C-p C-p C-n C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f
C-f C-f C-f C-b C-b C-b C-f C-x = C-x 1 C-f C-f C-f
C-b C-k b C-k C-p C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p
C-p C-e C-b C-b C-b C-y C-k ) C-j C-p C-p C-e C-b C-b
C-b C-b C-d C-e C-j C-p C-p C-e C-b C-b C-b C-t C-e
C-j C-p C-p C-e C-x C-b C-x o C-n C-n C-n RET C-x 1
C-x b C-x b * s c C-n C-p C-n
C-n e n a b l e - m u l t i b y t e - c h a r a c t
e r s C-j C-x b C-p C-n RET C-v l C-a C-n
C-n C-n C-e C-x 2 C-x o C-x b * s c
C-g C-x C-b C-x o C-n C-n C-n C-n RET C-p
C-p C-p C-x o C-p C-p C-a C-n C-SPC C-n C-n C-n C-n
w x r e p o r t - e m a c s - b u
g s C-g x r e p o r t - e m a c s -
b u g

Recent messages:
insert-file-contents-literally: Opening input file: no such file or directory, /Sources/emacs/nextstep/Emacs.app/Contents/Resources/etc/DOC-24.1.50.1
Mark set
Char: ä (228, #o344, #xe4, file ...) point=499 of 612 (81%) column=1 [2 times]
Char: DEL (127, #o177, #x7f) point=466 of 623 (75%) column=3
Char: ä (228, #o344, #xe4, file ...) point=466 of 625 (74%) column=3
Char: DEL (127, #o177, #x7f) point=486 of 647 (75%) column=23
Mark set
Quit
byte-code: Beginning of buffer [2 times]
Mark set
Quit

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message format-spec rfc822 mml
mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils find-func vc-git cc-mode cc-fonts cc-guess
cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs mule-util
multi-isearch info help-mode easymenu view help-fns byte-opt warnings cl
compile comint ansi-color ring bytecomp byte-compile cconv macroexp
vc-hg time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel
ns-win tool-bar dnd fontset image regexp-opt fringe lisp-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer loaddefs
button faces cus-face files text-properties overlay sha1 md5 base64
format env code-pages mule custom widget hashtable-print-readable
backquote make-network-process dbusbind ns multi-tty emacs)

-- 
‘Iodine deficiency was endemic in parts of the UK until, through what has been
described as “an unplanned and accidental public health triumph”, iodine was
added to cattle feed to improve milk production in the 1930s.’
(EN Pearce, Lancet, June 2011)

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 07 12:24:48 2020 Received: (at 11309) by debbugs.gnu.org; 7 Dec 2020 17:24:48
+0000 Received: from localhost ([127.0.0.1]:55385 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmKFj-00051n-Mf for submit@debbugs.gnu.org; Mon, 07 Dec 2020 12:24:47 -0500 Received: from quimby.gnus.org ([95.216.78.240]:39972) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmKFg-00051X-Ve for 11309@debbugs.gnu.org; Mon, 07 Dec 2020 12:24:46 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=EG83S2z2L6fVJ/hIgE3ZrtaAq6+bIbivHZilz2j7RXE=; b=Oe4U5Xw1ReyR/IgHe4ZWGwysOc hWIi7JpQLOsa7AgZVTZbAC+wsuyzigcJ/9QSVg+0198kfetyxoz7Zc/p7b2tWMY8qxl6JzL9Kd+d+ v5vr2etC9abKodD3jLUZJkhon+nvtGMiHy8kXeq67RRqYMpevf7GT80nVhOuDwtrobwQ=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1kmKFX-0008Bm-LA; Mon, 07 Dec 2020 18:24:38 +0100 From: Lars Ingebrigtsen To: Aidan Kehoe Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <4f93d952.287f22e03@parhasard.net> X-Now-Playing: Sylvan Esso's _WITH_: "The Glow" Date: Mon, 07 Dec 2020 18:24:34 +0100 In-Reply-To: <4f93d952.287f22e03@parhasard.net> (Aidan Kehoe's message of "Sun, 22 Apr 2012 11:11:30 +0100") Message-ID: <87blf5cywd.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. 
The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Aidan Kehoe writes: > (let ((case-fold-search t)) > (string-match "[[:upper:]]" "a\u0686")) > => 0 ;; As documented > > (upcase "\u0430") ;; CYRILLIC SMALL LETTER A > => "А" ;; "\u0410", so it's in the case table > > (l [...] Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 11309 Cc: 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Aidan Kehoe writes:

> (let ((case-fold-search t))
>   (string-match "[[:upper:]]" "a\u0686"))
> => 0 ;; As documented
>
> (upcase "\u0430") ;; CYRILLIC SMALL LETTER A
> => "А" ;; "\u0410", so it's in the case table
>
> (let ((case-fold-search t))
>   (string-match "[[:upper:]]" "\u0430\u0686"))
> => nil ;; Ah, this is unexpected.

I tried this in Emacs 28, and I can confirm that this behaviour is still
present.

> (let ((case-fold-search t))
>   (string-match "[[:lower:]]" "\u0410\u0686"))
> => 0 ;; But this works as documented.
>
> (upcase "\u03b2") ;; GREEK SMALL LETTER BETA
> => "Β" ;; "\u0392", it's in the case table
>
> (let ((case-fold-search t))
>   (string-match "[[:upper:]]" "\u03b2\u5357"))
> => nil ;; Oops
>
> (let ((case-fold-search t))
>   (string-match "[[:lower:]]" "\u0392\u5357"))
> => 0 ;; But this works, again.

And this, too.  Anybody have any insight here?

--=20 (domestic pets only, the antidote for overdose, milk.) bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 07 17:14:54 2020 Received: (at 11309) by debbugs.gnu.org; 7 Dec 2020 22:14:54 +0000 Received: from localhost ([127.0.0.1]:55760 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmOmU-0003iU-2y for submit@debbugs.gnu.org; Mon, 07 Dec 2020 17:14:54 -0500 Received: from mail70c50.megamailservers.eu ([91.136.10.80]:40440) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmOmR-0003iH-Lq for 11309@debbugs.gnu.org; Mon, 07 Dec 2020 17:14:52 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607379289; bh=77tTjjdmQqq4ebq9VvSjM6ssYTPWZ9MthlTWrOFwKdU=; h=From:Subject:Date:Cc:To:From; b=kw0NXj/5zwF+VqXPfcMfZIbzTRiUSCclCOagH2L2HBHlGaVs1um+1Lt1JUs9c8uk+ T60qOp3aTmBV9NDGVbO0PR9lEaubrMMGRrsu9dL26iahfAYRFyeKWPnHTtudoVah5w AVTmLVA54mykIgLKmN7M2jCBarnqCQy1j1cwmjrE= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail70c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B7MEk62024145; Mon, 7 Dec 2020 22:14:48 +0000 From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek Message-Id: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> Date: Mon, 7 Dec 2020 23:14:45 +0100 To: Lars Ingebrigtsen , Aidan Kehoe X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F29.5FCEA959.004A, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=UIOj4xXy c=1 sm=1 
tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=vFieRUaNK8m44jR1g9EA:9 a=CjuIK1q_8ugA:10 X-Origin-Country: SE X-Spam-Score: 3.0 (+++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Not surprising in the least given the broken logic: ((class_bits & BIT_UPPER) && (ISUPPER (c) || (corig != c && c == downcase (corig) && ISLOWER (c)))) || ((class_bits & BIT_LOWER) && (ISLOWER (c) || (corig != c && c == upcase (corig) && ISUPPER(c)))) [...] Content analysis details: (3.0 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 2.0 FAKE_REPLY_B No description available. X-Debbugs-Envelope-To: 11309 Cc: 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 2.0 (++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Not surprising in the least given the broken logic: ((class_bits & BIT_UPPER) && (ISUPPER (c) || (corig != c && c == downcase (corig) && ISLOWER (c)))) || ((class_bits & BIT_LOWER) && (ISLOWER (c) || (corig != c && c == upcase (corig) && ISUPPER(c)))) [...] 
Content analysis details: (2.0 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -1.0 MAILING_LIST_MULTI Multiple indicators imply a widely-seen list manager 2.0 FAKE_REPLY_B No description available.

Not surprising in the least given the broken logic:

          ((class_bits & BIT_UPPER) &&
           (ISUPPER (c) || (corig != c &&
                            c == downcase (corig) && ISLOWER (c)))) ||
          ((class_bits & BIT_LOWER) &&
           (ISLOWER (c) || (corig != c &&
                            c == upcase (corig) && ISUPPER(c)))) ||

where corig is the character being matched and c is corig after canonicalising, which appears to mean downcasing in practice.

This means that the second case (BIT_LOWER means [:lower:]) works more or less as intended (by accident) but the [:upper:] case is less lucky and doesn't, as observed.

ASCII characters aren't affected by this bug since they are handled by a separate bitmap. This has probably never worked properly.

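The analysis above can be checked outside Emacs with a small model. What follows is a minimal sketch, not Emacs source: the `toy_*` helpers, `canon`, and the `old_*`/`new_*` functions are hypothetical names, the toy case table covers only basic Cyrillic (U+0410..U+044F), canonicalisation is assumed to be plain downcasing as suggested above, and the `fold` flag stands in for the `canon_table != Qnil` test used by the patch posted later in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy case table (assumption): U+0410..U+042F are upper case,
   U+0430..U+044F are lower case, everything else is caseless.  */
static bool toy_isupper (int c) { return 0x0410 <= c && c <= 0x042F; }
static bool toy_islower (int c) { return 0x0430 <= c && c <= 0x044F; }
static int toy_downcase (int c) { return toy_isupper (c) ? c + 0x20 : c; }
static int toy_upcase (int c) { return toy_islower (c) ? c - 0x20 : c; }

/* Assumption: canonicalising for case folding is just downcasing.  */
static int
canon (int corig, bool fold)
{
  assert (0 <= corig);
  return fold ? toy_downcase (corig) : corig;
}

/* Old [:upper:] test, transcribed from the expression quoted above.  */
static bool
old_upper (int corig, bool fold)
{
  int c = canon (corig, fold);
  return (toy_isupper (c)
          || (corig != c && c == toy_downcase (corig) && toy_islower (c)));
}

/* Old [:lower:] test, transcribed the same way.  */
static bool
old_lower (int corig, bool fold)
{
  int c = canon (corig, fold);
  return (toy_islower (c)
          || (corig != c && c == toy_upcase (corig) && toy_isupper (c)));
}

/* Tests in the style of the patch: classify the original character and
   accept the opposite case only when folding is active.  */
static bool
new_upper (int corig, bool fold)
{
  return toy_isupper (corig) || (fold && toy_islower (corig));
}

static bool
new_lower (int corig, bool fold)
{
  return toy_islower (corig) || (fold && toy_isupper (corig));
}
```

In this model, `old_upper (0x0430, true)` is false because the lower-case letter is its own canonical form, so neither branch of the old test can succeed. `old_lower` succeeds for both cases only because the canonical form happens to be lower case.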
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 09:48:55 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 14:48:55 +0000 Received: from localhost ([127.0.0.1]:57082 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmeIR-00058y-A0 for submit@debbugs.gnu.org; Tue, 08 Dec 2020 09:48:55 -0500 Received: from mail1459c50.megamailservers.eu ([91.136.14.59]:44638 helo=mail267c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmeIO-00058d-26; Tue, 08 Dec 2020 09:48:53 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607438925; bh=vjkJeSQrrNFHKt583t39lptN9SdEvsMQXP0VyDgluCA=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=dsLQ2VsZ0y88sBP8OqKzbc5TdogVlXiV52wJmXpFHyCFzySSrWCbbHnIYL1fpP9gR wix3mT26pkgga77V69w5jcRqZnHj/2OgMBrr3cnvjjhThwVNhP3Fd00Vw8ND2lx2m5 VH+gtjxxs+EDdRbuyQhzqmL5naHjhSYcslclzjK8= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail267c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B8Emgcf003046; Tue, 8 Dec 2020 14:48:44 +0000 Content-Type: multipart/mixed; boundary="Apple-Mail=_E1F560C0-0F50-4308-BB0B-0FC29B2B5797" Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> Date: Tue, 8 Dec 2020 15:48:42 +0100 Message-Id: <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> To: Lars Ingebrigtsen , Aidan Kehoe X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F18.5FCF924D.005D, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 
X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=arrM9hRV c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=M51BFTxLslgA:10 a=3wM6RYQi0-KIhEu6XfMA:9 a=QEXdDO2ut3YA:10 a=kqviqhIUa05yftmTVB0A:9 a=B2y7HmGcmWMA:10 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: tags 11309 patch stop The attached patch should fix the bug for all characters except ß which still is not matched by [:lower:] nor by [:upper:] no matter the value of case-fold-search. The remaining problem seems to be that the upcase table maps ß to itself, which is wrong -- as long as we don't upcase ß to U+1E9E, it should not have an upcase table entry at all. I'll see what can [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 11309 Cc: 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

--Apple-Mail=_E1F560C0-0F50-4308-BB0B-0FC29B2B5797
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

tags 11309 patch
stop

The attached patch should fix the bug for all characters except ß which still is not matched by [:lower:] nor by [:upper:] no matter the value of case-fold-search.

The remaining problem seems to be that the upcase table maps ß to itself, which is wrong -- as long as we don't upcase ß to U+1E9E, it should not have an upcase table entry at all. I'll see what can be done about that.

--Apple-Mail=_E1F560C0-0F50-4308-BB0B-0FC29B2B5797
Content-Disposition: attachment; filename=0001-Fix-upper-and-lower-for-Unicode-characters-bug-11309.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Fix-upper-and-lower-for-Unicode-characters-bug-11309.patch"
Content-Transfer-Encoding: quoted-printable

From aead9bce8351477ee29d03d419a8c896a22aec4c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mattias=20Engdeg=C3=A5rd?=
Date: Tue, 8 Dec 2020 12:47:58 +0100
Subject: [PATCH] Fix [:upper:] and [:lower:] for Unicode characters
 (bug#11309)

* src/regex-emacs.c (execute_charset): Add canon_table argument to
allow expression of a correct predicate for [:upper:] and [:lower:].
(mutually_exclusive_p, re_match_2_internal): Pass extra argument.
* test/src/regex-emacs-tests.el (regexp-case-fold, regexp-eszett):
New tests.  Parts of regexp-eszett still fail and are commented out.
---
 src/regex-emacs.c             | 17 ++++++-----
 test/src/regex-emacs-tests.el | 57 +++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 8 deletions(-)

diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index 971a5f6374..6b5dded8e5 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -3575,9 +3575,11 @@ skip_noops (re_char *p, re_char *pend)
    opcode.  When the function finishes, *PP will be advanced past that opcode.
    C is character to test (possibly after translations) and CORIG is original
    character (i.e. without any translations).  UNIBYTE denotes whether c is
-   unibyte or multibyte character. */
+   unibyte or multibyte character.
+   CANON_TABLE is the canonicalisation table for case folding or Qnil.  */
 static bool
-execute_charset (re_char **pp, int c, int corig, bool unibyte)
+execute_charset (re_char **pp, int c, int corig, bool unibyte,
+                 Lisp_Object canon_table)
 {
   eassume (0 <= c && 0 <= corig);
   re_char *p = *pp, *rtp = NULL;
@@ -3617,11 +3619,9 @@ execute_charset (re_char **pp, int c, int corig, bool unibyte)
           (class_bits & BIT_BLANK && ISBLANK (c)) ||
 	  (class_bits & BIT_WORD  && ISWORD  (c)) ||
 	  ((class_bits & BIT_UPPER) &&
-	   (ISUPPER (c) || (corig != c &&
-			    c == downcase (corig) && ISLOWER (c)))) ||
+	   (ISUPPER (corig) || (canon_table != Qnil && ISLOWER (corig)))) ||
 	  ((class_bits & BIT_LOWER) &&
-	   (ISLOWER (c) || (corig != c &&
-			    c == upcase (corig) && ISUPPER(c)))) ||
+	   (ISLOWER (corig) || (canon_table != Qnil && ISUPPER (corig)))) ||
 	  (class_bits & BIT_PUNCT && ISPUNCT (c)) ||
 	  (class_bits & BIT_GRAPH && ISGRAPH (c)) ||
 	  (class_bits & BIT_PRINT && ISPRINT (c)))
@@ -3696,7 +3696,8 @@ mutually_exclusive_p (struct re_pattern_buffer *bufp, re_char *p1,
 	else if ((re_opcode_t) *p1 == charset
 		 || (re_opcode_t) *p1 == charset_not)
 	  {
-	    if (!execute_charset (&p1, c, c, !multibyte || ASCII_CHAR_P (c)))
+	    if (!execute_charset (&p1, c, c, !multibyte || ASCII_CHAR_P (c),
+                                  Qnil))
 	      {
 		DEBUG_PRINT ("	 No match => fast loop.\n");
 		return true;
@@ -4367,7 +4368,7 @@ re_match_2_internal (struct re_pattern_buffer *bufp,
 	      }
 
 	    p -= 1;
-	    if (!execute_charset (&p, c, corig, unibyte_char))
+	    if (!execute_charset (&p, c, corig, unibyte_char, translate))
 	      goto fail;
 
 	    d += len;
diff --git a/test/src/regex-emacs-tests.el b/test/src/regex-emacs-tests.el
index f9372e37b1..576630aa5a 100644
--- a/test/src/regex-emacs-tests.el
+++ b/test/src/regex-emacs-tests.el
@@ -803,4 +803,61 @@ regexp-multibyte-unibyte
   (should-not (string-match "å" "\xe5"))
   (should-not (string-match "[å]" "\xe5")))
 
+(ert-deftest regexp-case-fold ()
+  "Test case-sensitive and case-insensitive matching."
+  (let ((case-fold-search nil))
+    (should (equal (string-match "aB" "ABaB") 2))
+    (should (equal (string-match "åÄ" "ÅäåäÅÄåÄ") 6))
+    (should (equal (string-match "λΛ" "lΛλλΛ") 3))
+    (should (equal (string-match "шШ" "zШшшШ") 3))
+    (should (equal (string-match "[[:alpha:]]+" ".3aBåÄßλΛшШ中﷽") 2))
+    (should (equal (match-end 0) 12))
+    (should (equal (string-match "[[:alnum:]]+" ".3aBåÄßλΛшШ中﷽") 1))
+    (should (equal (match-end 0) 12))
+    (should (equal (string-match "[[:upper:]]+" ".3aåλшBÄΛШ中﷽") 6))
+    (should (equal (match-end 0) 10))
+    (should (equal (string-match "[[:lower:]]+" ".3BÄΛШaåλш中﷽") 6))
+    (should (equal (match-end 0) 10)))
+  (let ((case-fold-search t))
+    (should (equal (string-match "aB" "ABaB") 0))
+    (should (equal (string-match "åÄ" "ÅäåäÅÄåÄ") 0))
+    (should (equal (string-match "λΛ" "lΛλλΛ") 1))
+    (should (equal (string-match "шШ" "zШшшШ") 1))
+    (should (equal (string-match "[[:alpha:]]+" ".3aBåÄßλΛшШ中﷽") 2))
+    (should (equal (match-end 0) 12))
+    (should (equal (string-match "[[:alnum:]]+" ".3aBåÄßλΛшШ中﷽") 1))
+    (should (equal (match-end 0) 12))
+    (should (equal (string-match "[[:upper:]]+" ".3aåλшBÄΛШ中﷽") 2))
+    (should (equal (match-end 0) 10))
+    (should (equal (string-match "[[:lower:]]+" ".3BÄΛШaåλш中﷽") 2))
+    (should (equal (match-end 0) 10))))
+
+(ert-deftest regexp-eszett ()
+  "Test matching of ß and ẞ."
+  ;; ß is a lower-case letter (Ll); ẞ is an upper-case letter (Lu).
+  (let ((case-fold-search nil))
+    (should (equal (string-match "ß" "ß") 0))
+    (should (equal (string-match "ß" "ẞ") nil))
+    (should (equal (string-match "ẞ" "ß") nil))
+    (should (equal (string-match "ẞ" "ẞ") 0))
+    (should (equal (string-match "[[:alpha:]]" "ß") 0))
+    ;; bug#11309
+    ;;(should (equal (string-match "[[:lower:]]" "ß") 0))
+    ;;(should (equal (string-match "[[:upper:]]" "ß") nil))
+    (should (equal (string-match "[[:alpha:]]" "ẞ") 0))
+    (should (equal (string-match "[[:lower:]]" "ẞ") nil))
+    (should (equal (string-match "[[:upper:]]" "ẞ") 0)))
+  (let ((case-fold-search t))
+    (should (equal (string-match "ß" "ß") 0))
+    (should (equal (string-match "ß" "ẞ") 0))
+    (should (equal (string-match "ẞ" "ß") 0))
+    (should (equal (string-match "ẞ" "ẞ") 0))
+    (should (equal (string-match "[[:alpha:]]" "ß") 0))
+    ;; bug#11309
+    ;;(should (equal (string-match "[[:lower:]]" "ß") 0))
+    ;;(should (equal (string-match "[[:upper:]]" "ß") 0))
+    (should (equal (string-match "[[:alpha:]]" "ẞ") 0))
+    (should (equal (string-match "[[:lower:]]" "ẞ") 0))
+    (should (equal (string-match "[[:upper:]]" "ẞ") 0))))
+
 ;;; regex-emacs-tests.el ends here
-- 
2.21.1 (Apple Git-122.3)

--Apple-Mail=_E1F560C0-0F50-4308-BB0B-0FC29B2B5797--

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 11:02:21 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 16:02:22 +0000 Received: from localhost ([127.0.0.1]:59302 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfRV-0007KR-NH for submit@debbugs.gnu.org; Tue, 08 Dec 2020 11:02:21 -0500 Received: from eggs.gnu.org ([209.51.188.92]:35872) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfRU-0007Fm-5w for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 11:02:20 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:53147) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kmfRN-00088z-WE; Tue, 08 Dec 2020 11:02:14 -0500 Received: from [176.228.60.248] (port=3462 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kmfRN-00056B-6m; Tue, 08 Dec 2020 11:02:13 -0500 Date: Tue, 08 Dec 2020 18:02:05 +0200 Message-Id: <83ft4g70ci.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Tue, 8 Dec 2020 15:48:42 +0100) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, larsi@gnus.org, 11309@debbugs.gnu.org X-BeenThere:
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Mattias Engdegård > Date: Tue, 8 Dec 2020 15:48:42 +0100 > Cc: 11309@debbugs.gnu.org > > The remaining problem seems to be that the upcase table maps ß to itself, which is wrong -- as long as we don't upcase ß to U+1E9E, it should not have an upcase table entry at all. I'll see what can be done about that. Why is this a problem? AFAIR characters that don't have an upper-case form map to themselves when downcased. E.g. (upcase ?1) => ?1 Why should ß violate this convention? > * src/regex-emacs.c (execute_charset): Add canon_table argument to > allow expression of a correct predicate for [:upper:] and [:lower:]. > (mutually_exclusive_p, re_match_2_internal): Pass extra argument. > * test/src/regex-emacs-tests.el (regexp-case-fold, regexp-eszett): > New tests. Parts of regexp-eszett still fail and are commented out. Thanks, LGTM. 
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 11:10:30 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 16:10:30 +0000 Received: from localhost ([127.0.0.1]:59312 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfZN-0007qd-PH for submit@debbugs.gnu.org; Tue, 08 Dec 2020 11:10:29 -0500 Received: from mail-out.m-online.net ([212.18.0.9]:47809) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfZK-0007qS-1W for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 11:10:28 -0500 Received: from frontend01.mail.m-online.net (unknown [192.168.8.182]) by mail-out.m-online.net (Postfix) with ESMTP id 4Cr4sD3Mwlz1qsjq; Tue, 8 Dec 2020 17:10:24 +0100 (CET) Received: from localhost (dynscan1.mnet-online.de [192.168.6.70]) by mail.m-online.net (Postfix) with ESMTP id 4Cr4sD2fWzz1qryR; Tue, 8 Dec 2020 17:10:24 +0100 (CET) X-Virus-Scanned: amavisd-new at mnet-online.de Received: from mail.mnet-online.de ([192.168.8.182]) by localhost (dynscan1.mail.m-online.net [192.168.6.70]) (amavisd-new, port 10024) with ESMTP id DuJu4pxaUD0I; Tue, 8 Dec 2020 17:10:23 +0100 (CET) X-Auth-Info: tbfpiLc4/qIo/8od5MHh/bJzsChRgsF4Bu9ZjD1vXGOXdS0yyqS+6M5uKvTAkNc0 Received: from igel.home (ppp-46-244-165-151.dynamic.mnet-online.de [46.244.165.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.mnet-online.de (Postfix) with ESMTPSA; Tue, 8 Dec 2020 17:10:23 +0100 (CET) Received: by igel.home (Postfix, from userid 1000) id 722282C32D8; Tue, 8 Dec 2020 17:10:22 +0100 (CET) From: Andreas Schwab To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> X-Yow: ..Wait 'til those ITALIAN TEENAGERS get back to their HONDAS & discover them to be FILLED to the BRIM with MAZOLA!! 
Date: Tue, 08 Dec 2020 17:10:22 +0100 In-Reply-To: <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Tue, 8 Dec 2020 15:48:42 +0100") Message-ID: <87o8j4cm8h.fsf@igel.home> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -0.4 (/) X-Debbugs-Envelope-To: 11309 Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.4 (-) On Dez 08 2020, Mattias Engdegård wrote: > diff --git a/src/regex-emacs.c b/src/regex-emacs.c > index 971a5f6374..6b5dded8e5 100644 > --- a/src/regex-emacs.c > +++ b/src/regex-emacs.c > @@ -3575,9 +3575,11 @@ skip_noops (re_char *p, re_char *pend) > opcode. When the function finishes, *PP will be advanced past that opcode. > C is character to test (possibly after translations) and CORIG is original > character (i.e. without any translations). UNIBYTE denotes whether c is > - unibyte or multibyte character. */ > + unibyte or multibyte character. > + CANON_TABLE is the canonicalisation table for case folding or Qnil. */ The function uses that only as a boolean, so why not pass it as that? Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." 
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 11:19:50 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 16:19:50 +0000 Received: from localhost ([127.0.0.1]:59343 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfiP-00087B-B9 for submit@debbugs.gnu.org; Tue, 08 Dec 2020 11:19:49 -0500 Received: from mail1453c50.megamailservers.eu ([91.136.14.53]:35262 helo=mail266c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmfiN-00086v-IO for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 11:19:48 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607444381; bh=pEG9vvvme9JCMmcCSewfoDAKAmqvQlWpM7fpGB5uoCA=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=dRQUTXCVLO/JTJ4hNEVeCgydiRGCxYateZAbHd2d9gxfSPTrVHWy3yrzCT0ciwZmg reURR1wUmPiADpdM3CqsSs27JcvsdPjDUil8YIb/7tNqEvjoYu5R5Af6OYlKNvJiqY 3Q2UnWSwFBAVudOmZWWoPiFp4AjakYnQqXXse16M= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail266c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B8GJcZT016849; Tue, 8 Dec 2020 16:19:40 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <87o8j4cm8h.fsf@igel.home> Date: Tue, 8 Dec 2020 17:19:38 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <87o8j4cm8h.fsf@igel.home> To: Andreas Schwab X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F1D.5FCFA79C.00FB, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 
X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=fuqim2wf c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=tBb2bbeoAAAA:8 a=CPSLs4qXiNHsOacdAukA:9 a=CjuIK1q_8ugA:10 a=Oj-tNtZlA1e06AYgeCfH:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 8 dec. 2020 kl. 17.10 skrev Andreas Schwab : > The function uses that only as a boolean, so why not pass it as that? Thanks for reading the patch! It's a micro-optimisation: passing it as a boolean would entail an unconditional comparison against Qnil, but it is only used for [:lower:] and [:upper:] which are used i [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 11309 Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 8 dec. 2020 kl. 17.10 skrev Andreas Schwab : > The function uses that only as a boolean, so why not pass it as that? Thanks for reading the patch! 
It's a micro-optimisation: passing it as a boolean would entail an unconditional comparison against Qnil, but it is only used for [:lower:] and [:upper:] which are used in a small fraction of character alternatives. Maybe there is a cleaner way to do this without making the code slower.

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 11:57:46 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 16:57:46 +0000 Received: from localhost ([127.0.0.1]:59410 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgJ7-0004w4-R9 for submit@debbugs.gnu.org; Tue, 08 Dec 2020 11:57:46 -0500 Received: from mail1445c50.megamailservers.eu ([91.136.14.45]:34746 helo=mail265c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgJ4-0004vK-JO for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 11:57:45 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607446655; bh=lxbwrR3+5O/f/U1ve51US0HAqXDkFIfk8k2RMuZqjHg=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=sLqwVNb3tyHW+DeTb4fwWw7hLuJtYlw6L6IwnejIXERFLBpbwNngolUjpeYQqh3Uw rcDMbnMCCVVJXyVk6IEnjSyW7nIy0Hi+c/kV+zUIVR4g/Zy+B0hgkeSgX9Y6IVfwUb mSDOXNVZ4kPtXgXI5KFvrg49WW6oyzvjE0i0G0Gs= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail265c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B8GvWlq017622; Tue, 8 Dec 2020 16:57:34 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <83ft4g70ci.fsf@gnu.org> Date: Tue, 8 Dec 2020 17:57:32 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> References: 
<5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F1E.5FCFB07F.002B, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=QoAgIm6d c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=Qs5LAhLigy5ksFFOCM8A:9 a=QEXdDO2ut3YA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: 8 dec. 2020 kl. 17.02 skrev Eli Zaretskii : > AFAIR characters that don't have an upper-case > form map to themselves when downcased. E.g. > > (upcase ?1) => ?1 This is not about the Lisp (upcase x) function but the C upcase(x) function, which uses the upcase table directly. They affect the uppercasep and lowercasep functions which are used in the regexp engi [...] 
Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, larsi@gnus.org, 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

8 dec. 2020 kl. 17.02 skrev Eli Zaretskii :

> AFAIR characters that don't have an upper-case
> form map to themselves when downcased. E.g.
>
> (upcase ?1) => ?1

This is not about the Lisp (upcase x) function but the C upcase(x) function, which uses the upcase table directly.
They affect the uppercasep and lowercasep functions which are used in the regexp engine. Thus we get uppercasep(ß)=lowercasep(ß)=false which is wrong.

The logic of 'lowercasep' may need to be changed because it uses upcase and downcase, which return their argument if the respective table has no entry for it. Let's see what can be done.
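Editorial illustration of the point above: a minimal sketch in C of case predicates that are derived only from upcase and downcase tables. Everything named toy_* is hypothetical and the tables are tiny stand-ins (ASCII letters plus the two eszett code points); it is assumed here, not taken from the thread, that a character counts as upper case when downcasing changes it and as lower case when it is not upper case and upcasing changes it. Under those assumptions a self-mapping ß is classified as neither.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the case tables.  These are NOT Emacs's real data
   structures; only ASCII letters and the two eszett code points are
   modelled.  */
enum { ESZETT = 0x00DF, CAPITAL_ESZETT = 0x1E9E };

/* Hypothetical upcase lookup in which ß maps to itself.  Characters
   without an entry are returned unchanged.  */
int
toy_upcase (int c)
{
  if (c >= 'a' && c <= 'z')
    return c - 'a' + 'A';
  return c;
}

/* Hypothetical downcase lookup; assumed to map ẞ to ß.  */
int
toy_downcase (int c)
{
  if (c >= 'A' && c <= 'Z')
    return c - 'A' + 'a';
  if (c == CAPITAL_ESZETT)
    return ESZETT;
  return c;
}

/* Upper case: downcasing changes the character.  */
bool
toy_uppercasep (int c)
{
  return toy_downcase (c) != c;
}

/* Lower case: not upper case, and upcasing changes the character.  */
bool
toy_lowercasep (int c)
{
  return !toy_uppercasep (c) && toy_upcase (c) != c;
}

/* With ß mapping to itself in both directions, both predicates
   reject it, which is the result the message calls wrong.  */
void
toy_self_check (void)
{
  assert (!toy_uppercasep (ESZETT));
  assert (!toy_lowercasep (ESZETT));
}
```

In this model the digit 1 is also rejected by both predicates, so a self-mapping ß is indistinguishable from a caseless character.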
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 12:02:06 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 17:02:06 +0000 Received: from localhost ([127.0.0.1]:59425 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgNJ-0006oH-Kx for submit@debbugs.gnu.org; Tue, 08 Dec 2020 12:02:05 -0500 Received: from mail-wm1-f42.google.com ([209.85.128.42]:54600) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgNF-0006gN-LE for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 12:02:03 -0500 Received: by mail-wm1-f42.google.com with SMTP id d3so2616906wmb.4 for <11309@debbugs.gnu.org>; Tue, 08 Dec 2020 09:02:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tcd-ie.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version:content-transfer-encoding; bh=chj6hdOfXbktlTx3uURxyo14iEw+eAcVkRIV2t2WlSA=; b=U6DBtBvEufHz4Fd+vu2V68iVLLOfo5vyYTkmjXtvi+6vk5GvWiGKlZja7KY23wuDaG fPn27idTh5vpc+JshX1onIWiGQwZtI8BV715GjLrhLyqOwGnBssRTku618J3w6gmeHbu 7f170BelNP88V4V4Xv4/bfvbCkl3aY84EHgxAr43vWVwmXa5TbqDzSuyjCc/UoCLkt+n IBCH1zO0qE/r56M3TsFTkl1Nwi8LQGC808//Dhx/7gW3VIJE6zOnQzRYxykYB2hCNt8p Z3VEARQGqUkDrZqM/cWkCzoajn9TZlS5k4YuSvvTihUgfi/H6pXpjQomt9sPrEtUmqf9 GLyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to :message-id:user-agent:mime-version:content-transfer-encoding; bh=chj6hdOfXbktlTx3uURxyo14iEw+eAcVkRIV2t2WlSA=; b=FPBqGQ/zcSBGarscAT8IbtBB+08ghAaz5KEQtEF5WJEG5rvHURhw8FcJ4aFKoiVHnS wcyO5c+0jAUETRHpAFELDukXsuSpD0CXwtoCYdCccT5HuSXBy1v9fS3iVe8w1Cg4fvKb 2azx/01HPDGIsEbwUWZ+MbSZASu32ipnBh2rLpbG+mo0kXHJ73x9urRaGuo/nAQzp5ux wwerskkSbEKn+apGKzhc07Mo1O1hFR421s8Dyy8sZkqJ9SojBBzA7cFdy3JkAog1TDey ZyQQJe/81hPaXICB9xOYMKxyV0W5sbRP73YSWyjxy1GKATS+O+b+KYIv5M/U3IRJuFMQ LJ8A== X-Gm-Message-State: AOAM5316v6q9gAY0i/gv/zDFVFDx6YulM+qeGM+odlD9NE7ykjHLdqTO 
yAGsFHHU/i7iZhB2gHQI337Hwg== X-Google-Smtp-Source: ABdhPJxnwUlkt9xdln1IWuWv5XHHLkLPJl49uopcgIXhBw+Rjh/YaCLr7fde4OzyBsJP+p3B9Mm1TQ== X-Received: by 2002:a7b:c770:: with SMTP id x16mr4657081wmk.139.1607446915725; Tue, 08 Dec 2020 09:01:55 -0800 (PST) Received: from localhost ([2a02:8084:20e2:c380:92bd:1bfd:38fc:fae2]) by smtp.gmail.com with ESMTPSA id b18sm22548715wrt.54.2020.12.08.09.01.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Dec 2020 09:01:54 -0800 (PST) From: "Basil L. Contovounesios" To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> Date: Tue, 08 Dec 2020 17:01:53 +0000 In-Reply-To: <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Tue, 8 Dec 2020 15:48:42 +0100") Message-ID: <87im9cz0xq.fsf@tcd.ie> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 11309 Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Mattias Engdegård writes:

> @@ -3617,11 +3619,9 @@ execute_charset (re_char **pp, int c, int corig, bool unibyte)
>   (class_bits & BIT_BLANK && ISBLANK (c)) ||
>   (class_bits & BIT_WORD && ISWORD (c)) ||
>   ((class_bits & BIT_UPPER) &&
> -  (ISUPPER (c) || (corig != c &&
> -   c == downcase (corig) && ISLOWER (c)))) ||
> +  (ISUPPER (corig) || (canon_table != Qnil && ISLOWER (corig)))) ||
>   ((class_bits & BIT_LOWER) &&
> -  (ISLOWER (c) || (corig != c &&
> -   c == upcase (corig) && ISUPPER(c)))) ||
> +  (ISLOWER (corig) || (canon_table != Qnil && ISUPPER (corig)))) ||

Just curious: why not NILP?

Thanks,

-- 
Basil

From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 12:04:11 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 17:04:12 +0000 Received: from localhost ([127.0.0.1]:59434 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgPL-0007FI-N9 for submit@debbugs.gnu.org; Tue, 08 Dec 2020 12:04:11 -0500 Received: from mail1477c50.megamailservers.eu ([91.136.14.77]:56778 helo=mail118c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgPJ-0007F3-B9 for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 12:04:10 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607447042; bh=I2Uojk0rG/eWwTPkucLspAFiSUaoLZwMQPOLRsml+5s=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=duDCgXTU5ZboQmpOEvpZSdhLjtPcIdU1JQkyjVfWqXOK9QnYV+QjzkBuqGUmjri5y A+Jyq3z1xMlixPWrkmBSwetS6NIIUTpGONzJDPEfd6qFiiYXgap5soEf+B73zoeuls 7gDxrokUXCvOrEEdmy5t/sVEX4clCMoZ2VZiIM6w= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail118c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B8H402U002319; Tue, 8 Dec 2020 17:04:02 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <87im9cz0xq.fsf@tcd.ie> Date: Tue, 8 Dec 2020 18:04:00 +0100 Content-Transfer-Encoding: 7bit Message-Id: References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <87im9cz0xq.fsf@tcd.ie> To: "Basil L. 
Contovounesios" X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F19.5FCFB202.00BD, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=HYRqsRM8 c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=kj9zAlcOel0A:10 a=M51BFTxLslgA:10 a=Cy30xtZ3CQNrdbn0MFUA:9 a=CjuIK1q_8ugA:10 a=BhrJvqtl3CMA:10 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 11309 Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) 8 dec. 2020 kl. 18.01 skrev Basil L. Contovounesios : > Just curious: why not NILP? Momentary amnesia. Will change, thank you! 
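Editorial illustration of the predicate discussed in this exchange: a minimal, ASCII-only sketch in C of the shape of the new [:upper:] and [:lower:] tests, where the original (untranslated) character is examined and case folding widens each class to accept the opposite case. The function names and the case_fold parameter are hypothetical; case_fold stands in for the "canonicalisation table is non-nil" condition in the patch, and the standard isupper/islower stand in for the ISUPPER/ISLOWER macros.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Sketch of the [:upper:] test: CORIG is the character as it appears
   in the text, before any case translation.  Only valid for ASCII
   input here, because isupper/islower are used as stand-ins.  */
bool
sketch_matches_upper (int corig, bool case_fold)
{
  return isupper (corig) || (case_fold && islower (corig));
}

/* Sketch of the [:lower:] test, symmetric to the one above.  */
bool
sketch_matches_lower (int corig, bool case_fold)
{
  return islower (corig) || (case_fold && isupper (corig));
}

/* Without folding each class accepts only its own case; with folding
   either class accepts any cased letter, but still no digits.  */
void
sketch_class_self_check (void)
{
  assert (sketch_matches_upper ('A', false));
  assert (!sketch_matches_upper ('a', false));
  assert (sketch_matches_upper ('a', true));
  assert (!sketch_matches_upper ('1', true));
}
```

Passing a plain boolean, as done here for brevity, is the alternative raised earlier in the thread; the patch itself passes the table and tests it for nil only when one of these two classes is actually evaluated.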
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 08 12:06:17 2020 Received: (at 11309) by debbugs.gnu.org; 8 Dec 2020 17:06:17 +0000 Received: from localhost ([127.0.0.1]:59442 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgRN-0007J3-As for submit@debbugs.gnu.org; Tue, 08 Dec 2020 12:06:17 -0500 Received: from eggs.gnu.org ([209.51.188.92]:49832) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kmgRM-0007Io-1O for 11309@debbugs.gnu.org; Tue, 08 Dec 2020 12:06:16 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:54409) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kmgRE-00060s-KW; Tue, 08 Dec 2020 12:06:08 -0500 Received: from [176.228.60.248] (port=3388 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kmgR5-0003ww-Vj; Tue, 08 Dec 2020 12:06:07 -0500 Date: Tue, 08 Dec 2020 19:05:53 +0200 Message-Id: <83zh2o5itq.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Tue, 8 Dec 2020 17:57:32 +0100) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, larsi@gnus.org, 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Mattias Engdegård > Date: Tue, 8 Dec 2020 17:57:32 
+0100 > Cc: larsi@gnus.org, kehoea@parhasard.net, 11309@debbugs.gnu.org > > This is not about the Lisp (upcase x) function but the C upcase(x) function, which uses the upcase table directly. > They affect the uppercasep and lowercasep functions which are used in the regexp engine. Thus we get uppercasep(ß)=lowercasep(ß)=false which is wrong. Why is it wrong, and what practical problems does this cause? > The logic of 'lowercasep' may need to be changed because its use of upcase and downcase which return their argument if the respective table has no entry for it. Let's see what can be done. I don't want us to change the logic of such basic functions for the benefit of a single obscure character. Let's first see what problems with this character we have in practice, and then discuss what is the best way of solving those problems. TIA From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 09 09:37:33 2020 Received: (at 11309-done) by debbugs.gnu.org; 9 Dec 2020 14:37:33 +0000 Received: from localhost ([127.0.0.1]:33205 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kn0ay-0000KD-ST for submit@debbugs.gnu.org; Wed, 09 Dec 2020 09:37:33 -0500 Received: from mail1450c50.megamailservers.eu ([91.136.14.50]:36940 helo=mail265c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kn0av-0000Jc-To for 11309-done@debbugs.gnu.org; Wed, 09 Dec 2020 09:37:31 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607524643; bh=g91YUNLmCu12bec30ljU01NFxaneByCRdSxTmHhRMNQ=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=jFJZ7Z2zHDsiGLpAAPIyyAs6WkP4qJnOWxvWPrkszk1Gj7NBwmd47mI0j76mDFdTx WjjKONpOtWnP/DxSDExxCxaj1fn81EwnJiv+/7GsU7KG9Ntiv/W5PxuhfrSiBIf7nU wzLNAWm7/jRuT5VHFDevuDw4hStes/gAapQBR5Wk= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated 
bits=0) by mail265c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0B9EbK58029886; Wed, 9 Dec 2020 14:37:22 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <83zh2o5itq.fsf@gnu.org> Date: Wed, 9 Dec 2020 15:37:19 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F19.5FD0E123.000E, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=QoAgIm6d c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=41QRUTk52hwInh9ES8IA:9 a=QEXdDO2ut3YA:10 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: Eli, thanks for looking at the patch, now pushed to master (with Basil's suggested tweak). > Why is it wrong, and what practical problems does this cause? ß is a lower case letter so lowercasep(ß)=false is wrong. As a consequence, matching ß with [:lower:] and [:upper:] don't work correctly: ß should be matched by [:lower:] when case-fold-search is [...] 
Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 11309-done Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

Eli, thanks for looking at the patch, now pushed to master (with Basil's suggested tweak).

> Why is it wrong, and what practical problems does this cause?

ß is a lower case letter so lowercasep(ß)=false is wrong. As a consequence, matching ß with [:lower:] and [:upper:] doesn't work correctly: ß should be matched by [:lower:] when case-fold-search is nil, and by both [:lower:] and [:upper:] when case-fold-search is non-nil.

The problem stems from the fact that uppercasep and lowercasep don't use the Unicode case information directly (which perhaps they should) but derive the case indirectly from the upcase and downcase tables, and there is no way to state that a char is lower case but cannot be upcased or downcased. (Below I'm going to use the notation T[C] for the table T indexed by character C.)

Currently, characters missing from or self-mapping in the upcase and downcase tables are considered to be caseless. For instance, upcase[*]=downcase[*]=* and upcase[中]=downcase[中]=nil. However, we also have upcase[ß]=downcase[ß]=ß, causing the incorrect lowercasep result.

The solution that I ended up applying was the simplest possible: set upcase[ß]=ẞ (U+1E9E).
The special-uppercase properties ensure that (upcase "ß") => "SS", and now all tests pass.

(An acceptable alternative would have been to set upcase[ß]=nil and adapt lowercasep accordingly. I tried that and it works flawlessly, but involves slightly more changes.)

And that concludes the resolution of this bug.

From debbugs-submit-bounces@debbugs.gnu.org Wed Dec 09 10:46:23 2020 Received: (at 11309-done) by debbugs.gnu.org; 9 Dec 2020 15:46:23 +0000 Received: from localhost ([127.0.0.1]:35784 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kn1fb-0002pq-1u for submit@debbugs.gnu.org; Wed, 09 Dec 2020 10:46:23 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48904) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kn1fa-0002pe-5t for 11309-done@debbugs.gnu.org; Wed, 09 Dec 2020 10:46:22 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:49632) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kn1fT-0006oj-Jy; Wed, 09 Dec 2020 10:46:15 -0500 Received: from [176.228.60.248] (port=3262 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1kn1fS-0007jk-UG; Wed, 09 Dec 2020 10:46:15 -0500 Date: Wed, 09 Dec 2020 17:46:10 +0200 Message-Id: <83eejz56f1.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Wed, 9 Dec 2020 15:37:19 +0100) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) 
X-Debbugs-Envelope-To: 11309-done Cc: kehoea@parhasard.net, larsi@gnus.org, 11309-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---)

> From: Mattias Engdegård
> Date: Wed, 9 Dec 2020 15:37:19 +0100
> Cc: Lars Ingebrigtsen , Aidan Kehoe ,
>  11309-done@debbugs.gnu.org
>
> ß is a lower case letter so lowercasep(ß)=false is wrong. As a consequence, matching ß with [:lower:] and [:upper:] doesn't work correctly: ß should be matched by [:lower:] when case-fold-search is nil, and by both [:lower:] and [:upper:] when case-fold-search is non-nil.
>
> The problem stems from the fact that uppercasep and lowercasep don't use the Unicode case information directly (which perhaps they should) but derive the case indirectly from the upcase and downcase tables, and there is no way to state that a char is lower case but cannot be upcased or downcased. (Below I'm going to use the notation T[C] for the table T indexed by character C.)
>
> Currently, characters missing from or self-mapping in the upcase and downcase tables are considered to be caseless. For instance, upcase[*]=downcase[*]=* and upcase[中]=downcase[中]=nil. However, we also have upcase[ß]=downcase[ß]=ß, causing the incorrect lowercasep result.
>
> The solution that I ended up applying was the simplest possible: set upcase[ß]=ẞ (U+1E9E). The special-uppercase properties ensure that (upcase "ß") => "SS", and now all tests pass.
>
> (An acceptable alternative would have been to set upcase[ß]=nil and adapt lowercasep accordingly. I tried that and it works flawlessly, but involves slightly more changes.)
>
> And that concludes the resolution of this bug.

Thanks.
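Editorial illustration of the resolution quoted above: a minimal sketch in C comparing the two states of the upcase entry for ß. All sketch_* names are hypothetical and the tables are reduced to ASCII letters plus the two eszett code points; the predicate definitions (upper case if downcasing changes the character, lower case if not upper case and upcasing changes it) are an assumption of this sketch, as is the downcase entry mapping ẞ to ß. The eszett_up parameter is the value stored in the upcase table for ß.

```c
#include <assert.h>
#include <stdbool.h>

enum { ESZETT = 0x00DF, CAPITAL_ESZETT = 0x1E9E };

/* Hypothetical miniature upcase table.  ESZETT_UP is the entry stored
   for ß: ESZETT itself models the old state, CAPITAL_ESZETT the state
   after the fix described in the message.  */
int
sketch_upcase (int c, int eszett_up)
{
  if (c >= 'a' && c <= 'z')
    return c - 'a' + 'A';
  if (c == ESZETT)
    return eszett_up;
  return c;
}

/* Hypothetical miniature downcase table; assumed to map ẞ to ß.  */
int
sketch_downcase (int c)
{
  if (c >= 'A' && c <= 'Z')
    return c - 'A' + 'a';
  if (c == CAPITAL_ESZETT)
    return ESZETT;
  return c;
}

bool
sketch_uppercasep (int c)
{
  return sketch_downcase (c) != c;
}

bool
sketch_lowercasep (int c, int eszett_up)
{
  return !sketch_uppercasep (c) && sketch_upcase (c, eszett_up) != c;
}

/* Old table state: ß looks caseless.  New state: ß is lower case.  */
void
sketch_fix_self_check (void)
{
  assert (!sketch_lowercasep (ESZETT, ESZETT));
  assert (sketch_lowercasep (ESZETT, CAPITAL_ESZETT));
}
```

The sketch only covers the single-character table lookup; the string-level result (upcase "ß") => "SS" mentioned in the message comes from the special-uppercase properties, which are not modelled here.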
From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 10 04:36:24 2020 Received: (at 11309) by debbugs.gnu.org; 10 Dec 2020 09:36:24 +0000 Received: from localhost ([127.0.0.1]:37040 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knIN6-0004On-J5 for submit@debbugs.gnu.org; Thu, 10 Dec 2020 04:36:24 -0500 Received: from mail1456c50.megamailservers.eu ([91.136.14.56]:41356 helo=mail266c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knIN4-0004OX-8w for 11309@debbugs.gnu.org; Thu, 10 Dec 2020 04:36:23 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607592975; bh=Y1Emo36IeiMShY8W9WrUZdntbboXP1ccMSDG+rsQzuI=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=eVY/TLM7eR8DR78RElrQjENAsgbMiR/SWUi/lZ7Ge3BHS39qXz/bKMBk0CD5Tw+kC XeL7GJQ4DaUqL+2BaVI7iGibPsFCYBm5oaevElKKUXn0928YGmCSFrn1sRbM2J+vcx AHfieIPIHV/+t8LgLGT1PMKJGo7GQQbprT+9idC4= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail266c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0BA9aCP7012798; Thu, 10 Dec 2020 09:36:14 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> Date: Thu, 10 Dec 2020 10:36:12 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F1C.5FD1EC0F.005D, ss=1, re=0.000, 
recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=fuqim2wf c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=PhhdO1OwZ5-eRGUaGm8A:9 a=QEXdDO2ut3YA:10 X-Origin-Country: SE X-Spam-Score: 1.4 (+) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: As it turns out I had completely forgotten about Fupcase with a character argument -- (upcase ?ß) previously returned ?ß but ?ẞ after the change -- which was caught by casefiddle-tests. Now, what [...] Content analysis details: (1.4 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record 1.0 SPF_SOFTFAIL SPF: sender does not match SPF record (softfail) 0.4 KHOP_HELO_FCRDNS Relay HELO differs from its IP's reverse DNS X-Debbugs-Envelope-To: 11309 Cc: Aidan Kehoe , Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

As it turns out I had completely forgotten about Fupcase with a character argument -- (upcase ?ß) previously returned ?ß but ?ẞ after the change -- which was caught by casefiddle-tests. Now, what to do about it?
One solution would be the previous plan B: set upcase[ß]=nil, modify the lowercasep logic, and we will have (upcase ?ß) => ?ß again. However, I would argue that the current state is actually preferable:

Upcasing ß to ß never really makes sense. Words containing ß are written with SS in upper case: groß -> GROSS - which is one reason why the character-to-character use of Fupcase normally cannot be used for text containing the letter. The capital ß, ?ẞ, is still not widely employed but one of its purposes is when it is important to preserve the exact spelling of proper names when written in all caps: Gauß -> GAUẞ, not GAUSS. (I wouldn't be surprised if this will eventually become the general convention for all text, but we are getting ahead of society here.)

For these reasons, I'm adapting casefiddle-tests and calling it a feature.

From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 10 09:18:16 2020 Received: (at 11309) by debbugs.gnu.org; 10 Dec 2020 14:18:16 +0000 Received: from localhost ([127.0.0.1]:37290 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knMls-0005AD-0l for submit@debbugs.gnu.org; Thu, 10 Dec 2020 09:18:16 -0500 Received: from eggs.gnu.org ([209.51.188.92]:39542) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knMlp-00059z-9X for 11309@debbugs.gnu.org; Thu, 10 Dec 2020 09:18:14 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41263) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1knMlf-00049Z-Li; Thu, 10 Dec 2020 09:18:05 -0500 Received: from [176.228.60.248] (port=2549 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1knMld-0001tG-9w; Thu, 10 Dec 2020 09:18:03 -0500 Date: Thu, 10 Dec 2020 16:17:41 +0200 Message-Id: <83lfe54uey.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-Reply-To: 
(message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Thu, 10 Dec 2020 10:36:12 +0100) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, larsi@gnus.org, 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Mattias Engdegård > Date: Thu, 10 Dec 2020 10:36:12 +0100 > Cc: Lars Ingebrigtsen , Aidan Kehoe , > 11309@debbugs.gnu.org > > Upcasing ß to ß never really makes sense. Words containing ß are written with SS in upper case: groß -> GROSS - which is one reason why the character-to-character use of Fupcase normally cannot be used for text containing the letter. The capital ß, ?ẞ, is still not widely employed but one of its purposes is when it is important to preserve the exact spelling of proper names when written in all caps: Gauß -> GAUẞ, not GAUSS. (I wouldn't be surprised if this will eventually become the general convention for all text, but we are getting ahead of society here.) Wouldn't it be confusing that upcase treats ?ß and "ß" differently? 
From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 10 10:48:10 2020 Received: (at 11309) by debbugs.gnu.org; 10 Dec 2020 15:48:11 +0000 Received: from localhost ([127.0.0.1]:39106 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knOAs-0007uw-Et for submit@debbugs.gnu.org; Thu, 10 Dec 2020 10:48:10 -0500 Received: from mail173c50.megamailservers.eu ([91.136.10.183]:41204 helo=mail56c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knOAq-0007um-3j for 11309@debbugs.gnu.org; Thu, 10 Dec 2020 10:48:09 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607615286; bh=4GPA+fMbXLi8LQ7nuNLdh9vCfFYJIfVYiSsLPbovSIE=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=mFEkEuPNoNgAABh70WDdNoLSuqKL4pPj0t6bLjBs9JEgswX0wXI67ptJ32kW38c9t +eqkIEdXUg/NoXXYKW0BmvWXPtmNzVsXuJOZUfR2U4QQyiMN05elMGsaCTNuS/H5w5 Sq7mcLxTIyrBZiv51gaPOQDKmvawo95ONP7RuOc0= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail56c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0BAFm3u2000954; Thu, 10 Dec 2020 15:48:05 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <83lfe54uey.fsf@gnu.org> Date: Thu, 10 Dec 2020 16:48:03 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> <83lfe54uey.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F26.5FD24336.004A, ss=1, re=0.000, 
recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=L5BjvNb8 c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=mDV3o1hIAAAA:8 a=uB-1nI4wHfEIMBFjptoA:9 a=QEXdDO2ut3YA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, Lars Ingebrigtsen , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

10 dec. 2020 kl. 15.17 skrev Eli Zaretskii :

> Wouldn't it be confusing that upcase treats ?ß and "ß" differently?

Well it already did so before (returning ?ß and "SS", respectively) and it's not as if we have much of a choice since
(1) upcase is documented to return a value of the same type as its argument, and
(2) "SS" is definitely the right return value for "ß".
From debbugs-submit-bounces@debbugs.gnu.org Thu Dec 10 10:53:23 2020 Received: (at 11309) by debbugs.gnu.org; 10 Dec 2020 15:53:23 +0000 Received: from localhost ([127.0.0.1]:39110 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knOFv-00082T-AB for submit@debbugs.gnu.org; Thu, 10 Dec 2020 10:53:23 -0500 Received: from quimby.gnus.org ([95.216.78.240]:53408) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knOFs-00082A-5q for 11309@debbugs.gnu.org; Thu, 10 Dec 2020 10:53:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=UTObPU2/PT6K9il63xKzU2ueq5JGjas3GjOgAe+nA0I=; b=Ai1e5Ffql9hB0O5GzeDdokpvef Wjb5lUUdxtd1yyZg7gChKL5IaMOhOT2ScQRrwMQh4/4lMhtU1hwsI3AGJfvi+ev+PJYmuH1W3JfSO snADcBqXLMqYlp8PH6RlLjtfi87cmVRtjccUHl3exdyO7h4BE400wmtTbBeK/Lem2w08=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1knOFh-0005Zl-1A; Thu, 10 Dec 2020 16:53:12 +0100 From: Lars Ingebrigtsen To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> <83lfe54uey.fsf@gnu.org> Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAABGdBTUEAALGPC/xhBQAAACBj SFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAAIVBMVEUxLSoODAsiHBkn 
IR1DPDZGQT1TTkpgXFdhXFhlYFv///9+DHe0AAAAAWJLR0QKaND0VgAAAAd0SU1FB+QMCg8yAuNu jDEAAAEMSURBVDjLvdA9T8MwEAbgcxMx5xwlc88gJDZHEcwUUHdUNbPjIs8VlZqdqSM/GSdKiclR lA7l3eJH95EDITXEL2vXpmkOh8Zn75yFAhGVKjWA92MgKoAQO1LXZVkCRGUbpaiHjjyqPoTfkJjE utkuFt0zBrC/Mq7+NKL/HsC1sNvSGNLtramKjyUyeHswlXi/4xW+1bLe5ILDvalqoxdjkKtUwNOz Zq3GuRzIAcJXSSfgR6spgKeAzgb8A4J1zz9JMFqF4ydsRUhK/l4RbkdT/+NfQDKQNhEwF6ZgFTZ7 BE2WGGxygJwsnwGZB2E4xBno9HV1PNwAN0mkcTHjFXO/lRIRHy7bs5O6wK2+AE/wac6W1LESAAAA JXRFWHRkYXRlOmNyZWF0ZQAyMDIwLTEyLTEwVDE1OjUwOjAyKzAwOjAw/AF/CAAAACV0RVh0ZGF0 ZTptb2RpZnkAMjAyMC0xMi0xMFQxNTo1MDowMiswMDowMI1cx7QAAAAASUVORK5CYII= X-Now-Playing: Boris's _Noise_: "Ghost of Romance" Date: Thu, 10 Dec 2020 16:53:07 +0100 In-Reply-To: ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Thu, 10 Dec 2020 16:48:03 +0100") Message-ID: <87360dlkt8.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Mattias Engdegård writes: > Well it already did so before (returning ?ß and "SS", respectively) > and it's not as if we have much of a choice since > (1) upcase is documented to return a value of the same type as its argument [...] 
Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, Eli Zaretskii , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Mattias Engdegård writes:

> Well it already did so before (returning ?ß and "SS", respectively)
> and it's not as if we have much of a choice since
> (1) upcase is documented to return a value of the same type as its argument, and
> (2) "SS" is definitely the right return value for "ß".

I can only vaguely read German, but doesn't that depend on the locale? That is, whether an upcase of ß should be SS or ẞ depends on... what time and place we're at?

So returning either, or both (as after your patch), sounds fine to me -- it's an improvement on what Emacs did before.

-- 
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 04:18:12 2020 Received: (at 11309) by debbugs.gnu.org; 11 Dec 2020 09:18:12 +0000 Received: from localhost ([127.0.0.1]:39942 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kneZ1-0006NX-RV for submit@debbugs.gnu.org; Fri, 11 Dec 2020 04:18:12 -0500 Received: from mail200c50.megamailservers.eu ([91.136.10.210]:36150 helo=mail193c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1kneYy-0006NM-FK for 11309@debbugs.gnu.org; Fri, 11 Dec 2020 04:18:10 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1607678286; bh=05AJj91TmS0zNnagFdTlz4ML0jVvSf3E+Qz1i+3mEgw=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=eThp3HW84D80FK4Lxiv4HUaRipvBnYVeG40ydB5EpMtKw2WMpy7IDoW1xveDRiwYG zr0g3tsnLUnMYqY5zHs00j0+qMCPUqPRBOdKK3lVBq7SPCK6NSfnFqtChlXPnkmsZZ owy8cuYKI44s7Znb3luJR4N48kfecze8km/ekXSY= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c188-150-171-71.bredband.comhem.se [188.150.171.71]) (authenticated bits=0) by mail193c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id 0BB9I4Pm017842; Fri, 11 Dec 2020 09:18:05 +0000 Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.17\)) Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <87360dlkt8.fsf@gnus.org> Date: Fri, 11 Dec 2020 10:18:03 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <2A7536AB-B160-4005-AE13-CBF60A73E370@acm.org> References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> <83lfe54uey.fsf@gnu.org> <87360dlkt8.fsf@gnus.org> 
To: Lars Ingebrigtsen X-Mailer: Apple Mail (2.3445.104.17) X-CTCH-RefID: str=0001.0A782F19.5FD3394E.00BC, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=TYHoSiYh c=1 sm=1 tr=0 a=SF+I6pRkHZhrawxbOkkvaA==:117 a=SF+I6pRkHZhrawxbOkkvaA==:17 a=IkcTkHD0fZMA:10 a=M51BFTxLslgA:10 a=OocQHUDgAAAA:8 a=HiNvLcNmliDgOaIoKt8A:9 a=QEXdDO2ut3YA:10 a=xUZTl98r3Qw_uB5NK3jt:22 X-Origin-Country: SE X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, Eli Zaretskii , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/)

10 dec. 2020 kl. 16.53 skrev Lars Ingebrigtsen :

> I can only vaguely read German, but doesn't that depend on the locale?
> That is, whether an upcase of ß should be SS or ẞ depends on... what
> time and place we're at?

I suppose, but upcasing to ẞ is not standard practice (at least not yet) in any German-speaking country. The Swiss prefer not using ß at all and write ss instead, but that doesn't affect the case-conversion rules.
From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 11 10:26:46 2020 Received: (at 11309) by debbugs.gnu.org; 11 Dec 2020 15:26:46 +0000 Received: from localhost ([127.0.0.1]:42648 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knkJi-0008Bl-4K for submit@debbugs.gnu.org; Fri, 11 Dec 2020 10:26:46 -0500 Received: from quimby.gnus.org ([95.216.78.240]:35328) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1knkJg-0008BV-MT for 11309@debbugs.gnu.org; Fri, 11 Dec 2020 10:26:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnus.org; s=20200322; h=Content-Transfer-Encoding:Content-Type:MIME-Version:Message-ID :In-Reply-To:Date:References:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=bcq6GBys5/3TPER+BVcK6yMv7cxmWtWsWKt/+5WqHVI=; b=DhDbKOHuxIxWVZb2MqTAPqym+P +jKuGrg1BgVMuGRW9i6iqx7+OscErVbLsGjnZeQpbNlfSEupDRkpifxBdcVUlSAOxFh+O16s7gyE1 GpKyc9sl6/k/Xd6J6dozBfYj5CfNTz7pK+GmN7OvWivQx8TXGwN4lyy1MiTFO4wRvHFw=; Received: from cm-84.212.202.86.getinternet.no ([84.212.202.86] helo=xo) by quimby.gnus.org with esmtpsa (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1knkJX-0003wv-8l; Fri, 11 Dec 2020 16:26:37 +0100 From: Lars Ingebrigtsen To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= Subject: Re: bug#11309: 24.1.50; Case problems with [:upper:] and Cyrillic, Greek References: <5D75AE9F-F1F7-4A7E-A135-0071E03369AA@acm.org> <70DAA5B7-B336-4E8E-A342-05BD46BC0472@acm.org> <83ft4g70ci.fsf@gnu.org> <65B5A1CC-9D3D-495B-951D-733C9C0B355E@acm.org> <83zh2o5itq.fsf@gnu.org> <28B85957-B8DB-431D-A120-F17D8AE4693F@acm.org> <83lfe54uey.fsf@gnu.org> <87360dlkt8.fsf@gnus.org> <2A7536AB-B160-4005-AE13-CBF60A73E370@acm.org> Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwBAMAAAClLOS0AAAABGdBTUEAALGPC/xhBQAAACBj 
SFJNAAB6JgAAgIQAAPoAAACA6AAAdTAAAOpgAAA6mAAAF3CculE8AAAAFVBMVEUPHloXGy87W1NV i3CZf2zQzr7///9e39abAAAAAWJLR0QGYWa4fQAAAAlwSFlzAAAuIwAALiMBeKU/dgAAAAd0SU1F B+QMCw8YL3FmewkAAAFmSURBVDjLzdJLYoQgDABQddq9Yab7EvAAEj0ACgfoCPe/SoOD33HRZVlp HvmgFMX1Kov/tAAuw1IC1O/hCpEB3sNkUvwMlcrxUy3lAyGvM7gYiVChPDXRTk/UkUHoDlC5aH2I Iwof5R6UC/YrOERDcADt3PMRx5EQdLsH7/zPV4wTog7tbqxbDN42cTI8bm/ns7yymhijbYiHIvEE 2ITjAQj1aDojcYPbDKg4RbV62OCTYRDEcSMM7kr1nGAZBhruCF4ukCo5ECqlCHzgCk1KAMENlJGk 5QqeQcKd+CsaUNitMM/EpSYaSSp0bYZqAcNd2kau436kYfmBwx1JAWuPdIr0fZBI07eQ9wX6DAIb Q2ZNgNR7BsB+oqEG2EGYd/F1UwMc4XWdSp4M6neASj1FUV4BGiwPpfJfqysl4RoQjmDzVQa4gmIP Ne/LUG+QL1U+3yUsb3+GIVfKh6tXsHnXGYI8Qo7/Ak/QawweYX13AAAAJXRFWHRkYXRlOmNyZWF0 ZQAyMDIwLTEyLTExVDE1OjI0OjQ3KzAwOjAwpMt6PQAAACV0RVh0ZGF0ZTptb2RpZnkAMjAyMC0x Mi0xMVQxNToyNDo0NyswMDowMNWWwoEAAAAASUVORK5CYII= X-Now-Playing: Tricky's _Fall to Pieces_: "Close Now" Date: Fri, 11 Dec 2020 16:26:33 +0100 In-Reply-To: <2A7536AB-B160-4005-AE13-CBF60A73E370@acm.org> ("Mattias =?utf-8?Q?Engdeg=C3=A5rd=22's?= message of "Fri, 11 Dec 2020 10:18:03 +0100") Message-ID: <877dpo9xee.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Mattias Engdegård writes: > 10 dec. 2020 kl. 16.53 skrev Lars Ingebrigtsen : > >> I can only vaguely read German, but doesn't that depend one the locale? >> That is, whether an upcase of ß should be SS or ẞ [...] 
Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 11309 Cc: kehoea@parhasard.net, Eli Zaretskii , 11309@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Mattias Engdegård writes:

> 10 dec. 2020 kl. 16.53 skrev Lars Ingebrigtsen :
>
>> I can only vaguely read German, but doesn't that depend on the locale?
>> That is, whether an upcase of ß should be SS or ẞ depends on... what
>> time and place we're at?
>
> I suppose, but upcasing to ẞ is not standard practice (at least not
> yet) in any German-speaking country. The Swiss prefer not using ß at
> all and write ss instead, but that doesn't affect the case-conversion
> rules.

I thought I vaguely remembered somebody somewhere making ẞ a standard upcase, but it seems I remembered wrong. They only say that it's "also possible":

"According to the council’s 2017 spelling manual: When writing the uppercase [of ß], write SS. It’s also possible to use the uppercase ẞ. Example: Straße — STRASSE — STRAẞE"

-- 
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no

From unknown Tue Aug 19 10:01:20 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 09 Jan 2021 12:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. 
# # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator