From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 13:40:55 2019
From: Mattias Engdegård
To: bug-gnu-emacs@gnu.org
Subject: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 19:40:33 +0100
Message-Id: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org>

The rx (or ...) construct sometimes reorders its subexpressions, which
makes its semantics unpredictable. For example,

(rx (or "ab" "a") (or "a" "ab"))
=>
"\\(?:ab?\\)\\(?:ab?\\)"

The user reasonably expects (or e1 e2) to translate to E1\|E2, where ei
translates to Ei, or a semantic equivalent. Not having this control
makes rx useless or dangerous for many purposes.

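The difference is observable with hand-written regexps. The following is
a minimal sketch that assumes only the standard `string-match' and
`match-string'; `bug34641-match' is a hypothetical helper name introduced
for the illustration. It compares an in-order translation of the example
above with the merged form that rx is reported to produce.

```elisp
;; Hypothetical helper for this sketch: return the text matched by
;; REGEXP in STRING, or nil if there is no match.
(defun bug34641-match (regexp string)
  (and (string-match regexp string)
       (match-string 0 string)))

;; In-order translation of (or "ab" "a") (or "a" "ab"): each group
;; takes its first alternative that lets the match succeed.
(bug34641-match "\\(?:ab\\|a\\)\\(?:a\\|ab\\)" "aab")  ; => "aa"

;; Merged form from the report: both groups become the greedy "ab?".
(bug34641-match "\\(?:ab?\\)\\(?:ab?\\)" "aab")        ; => "aab"
```

Both regexps describe the same set of strings; only the match chosen by
an unanchored search differs.
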
The reason for the reordering is the use of regexp-opt behind the
scenes. Whether rx is the place to do this kind of optimisation is a
matter of opinion; mine is that it belongs in the regexp engine, where
other, more aggressive optimisations (DFA, native-code generation, etc.)
could be performed as well.

We could determine whether any string is a prefix of another. If not,
regexp-opt should be safe to call. Alternatively, this check could be
done in regexp-opt (activated by a flag). That would be my preferred
short-term solution.

(Speaking of regexp-opt, it has another bug that does not affect rx: it
returns the empty string if given an empty list of strings. The correct
return value is a regexp that never matches anything. Fix it, document
it, or turn it into an error?)

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 14:06:48 2019
From: Eli Zaretskii
To: Mattias Engdegård
Cc: 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 21:06:34 +0200
Message-Id: <83bm31ngzp.fsf@gnu.org>
In-Reply-To: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org>

> From: Mattias Engdegård
> Date: Sun, 24 Feb 2019 19:40:33 +0100
>
> We could determine whether any string is a prefix of another. If not,
> regexp-opt should be safe to call. Alternatively, this check could be
> done in regexp-opt (activated by a flag). That would be my preferred
> short-term solution.

Your preferred solution is fine with me, FWIW.

> (Speaking of regexp-opt, it has another bug that does not affect rx:
> it returns the empty string if given an empty list of strings. The
> correct return value is a regexp that never matches anything. Fix it,
> document it, or turn it into an error?)

Fix it, I think.

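The prefix condition quoted above can be sketched as follows. This is
only an illustration under stated assumptions, not the check used by any
patch in this thread: `bug34641-some-prefix-p' is a hypothetical name,
and the quadratic scan trades speed for brevity. It relies on the
standard `string-prefix-p'.

```elisp
;; Hypothetical helper for this sketch: return t if some element of
;; STRINGS is a proper prefix of another element, nil otherwise.
(defun bug34641-some-prefix-p (strings)
  (catch 'found
    (dolist (a strings)
      (dolist (b strings)
        ;; Equal strings are skipped so that only proper prefixes count.
        (when (and (not (string= a b))
                   (string-prefix-p a b))
          (throw 'found t))))
    nil))

(bug34641-some-prefix-p '("ab" "a"))     ; => t, "a" is a prefix of "ab"
(bug34641-some-prefix-p '("abc" "abd"))  ; => nil, reordering is harmless
```

When the function returns nil, at most one of the strings can match at
any given position, so the order of the alternatives cannot change which
text is matched.
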
From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 16:18:37 2019
From: Mattias Engdegård
To: Eli Zaretskii
Cc: 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 22:18:28 +0100
Message-Id: <065957BB-1332-458B-8757-742A81CED4A5@acm.org>
In-Reply-To: <83bm31ngzp.fsf@gnu.org>

On 24 Feb 2019, at 20:06, Eli Zaretskii wrote:
>
> Your preferred solution is fine with me, FWIW.

Thank you; patch attached.

>> (Speaking of regexp-opt, it has another bug that does not affect rx:
>> it returns the empty string if given an empty list of strings. The
>> correct return value is a regexp that never matches anything. Fix it,
>> document it, or turn it into an error?)
>
> Fix it, I think.

I'll prepare another patch. Is there a preferred or particularly clever
never-matching regexp? If not, would \(?:$\)A do?

Content-Disposition: attachment; filename=0001-rx-fix-or-ordering-by-adding-argument-to-regexp-opt.patch

From 0ba8d5e51714519d818c519581e699ca82047e66 Mon Sep 17 00:00:00 2001
From: Mattias Engdegård
Date: Sun, 24 Feb 2019 22:12:52 +0100
Subject: [PATCH] rx: fix `or' ordering by adding argument to regexp-opt

The rx `or' form may reorder its arguments in an unpredictable way,
contrary to user expectation, since it sometimes uses `regexp-opt'.
Add a NOREORDER option to `regexp-opt' for preventing it from
producing a reordered regexp (Bug#34641).

* doc/lispref/searching.texi (Regular Expression Functions):
* etc/NEWS (Lisp Changes in Emacs 27.1):
Describe the new regexp-opt NOREORDER argument.
* lisp/emacs-lisp/regexp-opt.el (regexp-opt): Add NOREORDER.
Make no attempt at regexp improvement if the set of strings contains
a prefix of another string.
(regexp-opt--contains-prefix): New.
* lisp/emacs-lisp/rx.el (rx-or): Call regexp-opt with NOREORDER.
* test/lisp/emacs-lisp/rx-tests.el: Test rx `or' form match order.
---
 doc/lispref/searching.texi       | 12 ++++++++---
 etc/NEWS                         |  7 ++++++
 lisp/emacs-lisp/regexp-opt.el    | 37 ++++++++++++++++++++++++++++----
 lisp/emacs-lisp/rx.el            |  2 +-
 test/lisp/emacs-lisp/rx-tests.el | 13 +++++++++++
 5 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index cfbd2449b1..73a7304a3b 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -950,7 +950,7 @@ whitespace:
 @end defun
 
 @cindex optimize regexp
-@defun regexp-opt strings &optional paren
+@defun regexp-opt strings &optional paren noreorder
 This function returns an efficient regular expression that will match
 any of the strings in the list @var{strings}.  This is useful when you
 need to make matching or searching as fast as possible---for example,
@@ -985,8 +985,14 @@ if it is necessary to ensure that a postfix operator appended to
 it will apply to the whole expression.
 @end table
 
-The resulting regexp of @code{regexp-opt} is equivalent to but usually
-more efficient than that of a simplified version:
+The optional argument @var{noreorder}, if @code{nil}, allows the
+returned regexp to match the strings in any order.  If non-@code{nil},
+the regexp is equivalent to a chain of alternatives (by the @samp{\|}
+operator) of the strings in the order given.
+
+Up to reordering, the resulting regexp of @code{regexp-opt} is
+equivalent to but usually more efficient than that of a simplified
+version:
 
 @example
 (defun simplified-regexp-opt (strings &optional paren)
diff --git a/etc/NEWS b/etc/NEWS
index 67e376d9b3..de9a8defbd 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1614,6 +1614,13 @@ given frame supports resizing.
 This is currently supported on GNUish hosts and on modern versions of
 MS-Windows.
 
++++
+** The function 'regexp-opt' accepts an additional optional argument.
+By default, the regexp returned by 'regexp-opt' may match the strings
+in any order.  If the new third argument is non-nil, the match is
+guaranteed to be performed in the order given, as if the strings were
+made into a regexp by joining them with '\|'.
+
 ^L
 * Changes in Emacs 27.1 on Non-Free Operating Systems
 
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index 152dca2309..33a5b770a0 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -84,7 +84,7 @@
 ;;; Code:
 
 ;;;###autoload
-(defun regexp-opt (strings &optional paren)
+(defun regexp-opt (strings &optional paren noreorder)
   "Return a regexp to match a string in the list STRINGS.
 Each string should be unique in STRINGS and should not contain
 any regexps, quoted or not.  Optional PAREN specifies how the
@@ -111,8 +111,13 @@ nil
     necessary to ensure that a postfix operator appended to it will
     apply to the whole expression.
 
-The resulting regexp is equivalent to but usually more efficient
-than that of a simplified version:
+The optional argument NOREORDER, if nil, allows the returned
+regexp to match the strings in any order.  If non-nil, the regexp
+is equivalent to a chain of alternatives (by the `\\|' operator)
+of the strings in the order given.
+
+Up to reordering, the resulting regexp is equivalent to but
+usually more efficient than that of a simplified version:
 
  (defun simplified-regexp-opt (strings &optional paren)
    (let ((parens
@@ -133,7 +138,15 @@ than that of a simplified version:
 	   (open (cond ((stringp paren) paren) (paren "\\(")))
 	   (sorted-strings (delete-dups
 			    (sort (copy-sequence strings) 'string-lessp)))
-	   (re (regexp-opt-group sorted-strings (or open t) (not open))))
+	   (re
+            ;; If NOREORDER is non-nil and the list contains a prefix
+            ;; of another string, we give up all attempts at optimisation.
+            ;; There is plenty of room for improvement (Bug#34641).
+            (if (and noreorder (regexp-opt--contains-prefix sorted-strings))
+                (concat (or open "\\(?:")
+                        (mapconcat #'regexp-quote strings "\\|")
+                        "\\)")
+              (regexp-opt-group sorted-strings (or open t) (not open)))))
       (cond ((eq paren 'words)
 	     (concat "\\<" re "\\>"))
 	    ((eq paren 'symbols)
@@ -313,6 +326,22 @@ CHARS should be a list of characters."
           (concat "[" dash caret "]"))
       (concat "[" bracket charset caret dash "]"))))
 
+
+(defun regexp-opt--contains-prefix (strings)
+  "Whether a list of strings contains a proper prefix of one of its elements.
+STRINGS must be sorted and free from duplicates."
+  (let ((s strings))
+    ;; In a lexicographically sorted list, a string always immediately
+    ;; succeeds one of its prefixes.
+    (while (and (cdr s)
+                (not (string-equal
+                      (car s)
+                      (substring (cadr s) 0 (min (length (car s))
+                                                 (length (cadr s)))))))
+      (setq s (cdr s)))
+    (cdr s)))
+
+
 (provide 'regexp-opt)
 
 ;;; regexp-opt.el ends here
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 715cd608c4..ca756efb49 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -393,7 +393,7 @@ FORM is of the form `(and FORM1 ...)'."
   (rx-group-if
    (if (memq nil (mapcar 'stringp (cdr form)))
        (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")
-     (regexp-opt (cdr form)))
+     (regexp-opt (cdr form) nil t))
    (and (memq rx-parent '(: * t)) rx-parent)))
 
 
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index e14feda347..fa3d9b0d5e 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -92,5 +92,18 @@
                          (*? "e") (+? "f") (\?? "g") (?? "h"))))
                  "a*b+c?d?e*?f+?g??h??")))
 
+(ert-deftest rx-or ()
+  ;; Test or-pattern reordering (Bug#34641).
+  (let ((s "abc"))
+    (should (equal (and (string-match (rx (or "abc" "ab" "a")) s)
+                        (match-string 0 s))
+                   "abc"))
+    (should (equal (and (string-match (rx (or "ab" "abc" "a")) s)
+                        (match-string 0 s))
+                   "ab"))
+    (should (equal (and (string-match (rx (or "a" "ab" "abc")) s)
+                        (match-string 0 s))
+                   "a"))))
+
 (provide 'rx-tests)
 ;; rx-tests.el ends here.
-- 
2.17.2 (Apple Git-113)

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 17:45:08 2019
From: "Basil L. Contovounesios"
To: Mattias Engdegård
Cc: Eli Zaretskii, 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 22:44:57 +0000
Message-ID: <87mumk957a.fsf@tcd.ie>
In-Reply-To: <065957BB-1332-458B-8757-742A81CED4A5@acm.org>

Mattias Engdegård writes:

> Is there a preferred or particularly clever never-matching regexp?
> If not, would \(?:$\)A do?

FWIW, CC Mode has used "a\\`" since the following discussion:
https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00876.html

Stefan also suggested making a variable out of this, but I don't think
anything came of that:
https://lists.gnu.org/archive/html/emacs-devel/2018-04/msg00047.html

-- 
Basil

From debbugs-submit-bounces@debbugs.gnu.org Sun Feb 24 21:38:17 2019
From: Noam Postavsky
To: Mattias Engdegård
Cc: 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Sun, 24 Feb 2019 21:37:56 -0500
In-Reply-To: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org>

On Sun, 24 Feb 2019 at 13:41, Mattias Engdegård wrote:
>
> The rx (or ...) construct sometimes reorders its subexpressions, which
> makes its semantics unpredictable. For example,
>
> (rx (or "ab" "a") (or "a" "ab"))
> =>
> "\\(?:ab?\\)\\(?:ab?\\)"
>
> The user reasonably expects (or e1 e2) to translate to E1\|E2, where ei
> translates to Ei, or a semantic equivalent.

I don't see the problem, isn't "ab?" semantically equivalent to
"ab\\|a" (and "a\\|ab")?

> (Speaking of regexp-opt, it has another bug that does not affect rx:
> it returns the empty string if given an empty list of strings. The
> correct return value is a regexp that never matches anything.

This sounds familiar, though I can't locate a report for it.

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 25 04:56:52 2019
From: Mattias Engdegård
To: Noam Postavsky
Cc: 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Mon, 25 Feb 2019 10:56:44 +0100
Message-Id: <07B35E27-3082-4DDD-A1C9-0D8286D40452@acm.org>

On 25 Feb 2019, at 03:37, Noam Postavsky wrote:
>
> I don't see the problem, isn't "ab?" semantically equivalent to
> "ab\\|a" (and "a\\|ab")?

Good question! When the match is anchored at the end, they are indeed
equivalent. They are also equivalent for POSIX regexps, which prefer the
longest match. But in Emacs, the first (leftmost) matching alternative
is used. Suppose we are matching against the string "abc". Then

ab\|a  matches "ab"
a\|ab  matches "a"
ab?    matches "ab"
ab??   matches "a" (non-greedy operator)

(I remember writing, young and foolish, [0-9]+\|0[xX][0-9a-fA-F]+ to
match a number in decimal or hex, and was surprised that all hex numbers
were zero.)

>> (Speaking of regexp-opt, it has another bug that does not affect rx:
>> it returns the empty string if given an empty list of strings. The
>> correct return value is a regexp that never matches anything.
>
> This sounds familiar, though I can't locate a report for it.

If you do remember, please tell us about it.

The `or' operator in SRE can be used with an empty argument list, and
will then not match anything. It is a useful limit case for
machine-generated regexps.

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 25 09:26:25 2019
From: Mattias Engdegård
To: "Basil L. Contovounesios"
Cc: Eli Zaretskii, 34641@debbugs.gnu.org
Subject: Re: bug#34641: rx: (or ...) order unpredictable
Date: Mon, 25 Feb 2019 15:26:18 +0100
Message-Id: <759EA0BC-6EE9-4711-A5C9-C631207FF7E5@acm.org>
In-Reply-To: <87mumk957a.fsf@tcd.ie>

On 24 Feb 2019, at 23:44, Basil L. Contovounesios wrote:
>
> FWIW, CC Mode has used "a\\`" since the following discussion:

Thank you, I'll use that then.
Here is a patch (to be applied after the other one).
--Apple-Mail=_AE387693-4D25-4334-A068-17F80648DE45
Content-Disposition: attachment; filename=0001-Correct-regexp-opt-return-value-for-empty-string-lis.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-Correct-regexp-opt-return-value-for-empty-string-lis.patch"

From 28f34a04513254c5bb4507ec6daa510e7ba166da Mon Sep 17 00:00:00 2001
From: Mattias Engdegård
Date: Mon, 25 Feb 2019 15:22:02 +0100
Subject: [PATCH] Correct regexp-opt return value for empty string list

When regexp-opt is called with an empty list of strings, return a regexp
that doesn't match anything instead of the empty string (Bug#34641).

* doc/lispref/searching.texi (Regular Expression Functions):
* etc/NEWS:
Document the new behaviour.
* lisp/emacs-lisp/regexp-opt.el (regexp-opt):
Return a never-match regexp for empty inputs.
---
 doc/lispref/searching.texi    |  3 +++
 etc/NEWS                      |  6 ++++++
 lisp/emacs-lisp/regexp-opt.el | 23 +++++++++++++++--------
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 73a7304a3b..0b944a2711 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -960,6 +960,9 @@ possible.  A hand-tuned regular expression can sometimes be slightly
 more efficient, but is almost never worth the effort.}.
 @c E.g., see https://debbugs.gnu.org/2816
 
+If @var{strings} is empty, the return value is a regexp that never
+matches anything.
+
 The optional argument @var{paren} can be any of the following:
 
 @table @asis
diff --git a/etc/NEWS b/etc/NEWS
index 5f7616429b..6506a1c6b5 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1624,6 +1624,12 @@ in any order.  If the new third argument is non-nil, the match is
 guaranteed to be performed in the order given, as if the strings were
 made into a regexp by joining them with '\|'.
 
++++
+** The function 'regexp-opt', when given an empty list of strings, now
+returns a regexp that never matches anything, which is an identity for
+this operation.  Previously, the empty string was returned in this
+case.
+
 ^L
 * Changes in Emacs 27.1 on Non-Free Operating Systems
 
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index 33a5b770a0..107b453637 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -90,6 +90,9 @@ Each string should be unique in STRINGS and should not contain
 any regexps, quoted or not.  Optional PAREN specifies how the
 returned regexp is surrounded by grouping constructs.
 
+If STRINGS is empty, the return value is a regexp that never
+matches anything.
+
 The optional argument PAREN can be any of the following:
 
 a string
@@ -139,14 +142,18 @@ usually more efficient than that of a simplified version:
            (sorted-strings (delete-dups
                             (sort (copy-sequence strings) 'string-lessp)))
            (re
-            ;; If NOREORDER is non-nil and the list contains a prefix
-            ;; of another string, we give up all attempts at optimisation.
-            ;; There is plenty of room for improvement (Bug#34641).
-            (if (and noreorder (regexp-opt--contains-prefix sorted-strings))
-                (concat (or open "\\(?:")
-                        (mapconcat #'regexp-quote strings "\\|")
-                        "\\)")
-              (regexp-opt-group sorted-strings (or open t) (not open)))))
+            (cond
+             ;; No strings: return a\` which cannot match anything.
+             ((null strings)
+              (concat (or open "\\(?:") "a\\`\\)"))
+             ;; If we cannot reorder, give up all attempts at
+             ;; optimisation.  There is room for improvement (Bug#34641).
+             ((and noreorder (regexp-opt--contains-prefix sorted-strings))
+              (concat (or open "\\(?:")
+                      (mapconcat #'regexp-quote strings "\\|")
+                      "\\)"))
+             (t
+              (regexp-opt-group sorted-strings (or open t) (not open))))))
       (cond ((eq paren 'words)
              (concat "\\<" re "\\>"))
             ((eq paren 'symbols)
-- 
2.17.2 (Apple Git-113)

--Apple-Mail=_AE387693-4D25-4334-A068-17F80648DE45--

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 25 09:43:26 2019
From: Noam Postavsky
Date: Mon, 25 Feb 2019 09:43:07 -0500
Subject: Re: bug#34641: rx: (or ...) order unpredictable
To: Mattias Engdegård
Cc: 34641@debbugs.gnu.org
In-Reply-To: <07B35E27-3082-4DDD-A1C9-0D8286D40452@acm.org>

On Mon, 25 Feb 2019 at 04:56, Mattias Engdegård wrote:

> Good question! When the match is anchored at the end, they are indeed
> equivalent. They also are equivalent for Posix regexps, which prefer
> the longest match. But in Emacs, the first (leftmost) matching
> alternative is used.
>
> Suppose we are matching against the string "abc". Then
> ab\|a matches "ab"
> a\|ab matches "a"

Oh, huh. So it does. I guess I've never used regexp in a situation
where this subtle corner case would come up.

> >> (Speaking of regexp-opt, it has another bug that does not affect rx:
> >> it returns the empty string if given an empty list of strings. The
> >> correct return value is a regexp that never matches anything.
> >
> > This sounds familiar, though I can't locate a report for it.
>
> If you do remember, please tell us about it.
> The `or' operator in SRE can be used with an empty argument list, and
> will then not match anything. It is a useful limit case for
> machine-generated regexps.

Right, found it this time, it's Bug#20307.

From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 25 09:48:37 2019
From: Mattias Engdegård
Date: Mon, 25 Feb 2019 15:48:31 +0100
Subject: Re: bug#34641: rx: (or ...) order unpredictable
To: Noam Postavsky
Cc: 34641@debbugs.gnu.org

On 25 Feb 2019, at 15:43, Noam Postavsky wrote:
>
> Right, found it this time, it's Bug#20307.

Excellent! I'll move this part to that bug then, and update the patch to
use that bug number.
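
The leftmost-alternative behaviour discussed in the messages above can be checked directly in Emacs. What follows is only a minimal sketch and not part of the thread or of the patches: the helper name `demo-first-match' is made up for illustration, while `string-match' and `match-string' are the standard Emacs functions. It evaluates the four cases from the "abc" table and the decimal-or-hex pitfall.

```elisp
;; Hypothetical helper, named for this illustration only:
;; return the text that REGEXP matches in STRING, or nil if there is no match.
(defun demo-first-match (regexp string)
  (let ((case-fold-search nil))
    (when (string-match regexp string)
      (match-string 0 string))))

;; The four cases from the thread, all matched against "abc".
(demo-first-match "ab\\|a" "abc")   ; "ab": the first alternative is tried first
(demo-first-match "a\\|ab" "abc")   ; "a":  again the first alternative wins
(demo-first-match "ab?" "abc")      ; "ab": greedy optional
(demo-first-match "ab??" "abc")     ; "a":  non-greedy optional

;; The decimal-or-hex pitfall: the decimal branch matches the leading "0",
;; so the hex branch is never tried.
(demo-first-match "[0-9]+\\|0[xX][0-9a-fA-F]+" "0x1F")   ; "0"

;; Listing the more specific alternative first gives the intended match.
(demo-first-match "0[xX][0-9a-fA-F]+\\|[0-9]+" "0x1F")   ; "0x1F"
```

Because the order of alternatives decides the result whenever one string is a prefix of another, a translation of (or "a" "ab") that silently swaps or merges the branches can change what is matched, which is the concern raised in this report.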
From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 07:33:24 2019
From: Eli Zaretskii
Date: Sat, 02 Mar 2019 14:33:05 +0200
Subject: Re: bug#34641: rx: (or ...) order unpredictable
To: Mattias Engdegård
Cc: 34641@debbugs.gnu.org
Message-Id: <83bm2th2wu.fsf@gnu.org>
In-reply-to: <065957BB-1332-458B-8757-742A81CED4A5@acm.org> (message from Mattias Engdegård on Sun, 24 Feb 2019 22:18:28 +0100)

> From: Mattias Engdegård
> Date: Sun, 24 Feb 2019 22:18:28 +0100
> Cc: 34641@debbugs.gnu.org
>
> > Your preferred solution is fine with me, FWIW.
>
> Thank you; patch attached.

Thanks, LGTM with minor comments below.

> +The optional argument @var{noreorder}, if @code{nil}, allows the
> +returned regexp to match the strings in any order.  If non-@code{nil},
> +the regexp is equivalent to a chain of alternatives (by the @samp{\|}
> +operator) of the strings in the order given.

I find the text in NEWS much more clear regarding what happens when
the new arg is non-nil.  I think what is missing is a more explicit
description you have in NEWS:

  If the new third argument is non-nil, the match is
  guaranteed to be performed in the order given, as if the strings were
  made into a regexp by joining them with '\|'.

So I suggest to mention explicitly the "match is guaranteed to be
performed in the order given" part.

> +The optional argument NOREORDER, if nil, allows the returned

We usually say "if nil or omitted" for optional arguments.

> +(defun regexp-opt--contains-prefix (strings)
> +  "Whether a list of strings contains a proper prefix of one of its elements.
> +STRINGS must be sorted and free from duplicates."

It is usually a good idea to refer to arguments explicitly in the
first sentence of a doc string.  In this case, I think just up-casing
STRINGS there should be enough.

From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 09:05:15 2019
From: Mattias Engdegård
Date: Sat, 2 Mar 2019 15:05:08 +0100
Subject: Re: bug#34641: rx: (or ...) order unpredictable
To: Eli Zaretskii
Cc: 34641@debbugs.gnu.org
Message-Id: <0EAAABCF-8AA5-473A-A1E7-4B18040697D8@acm.org>
In-Reply-To: <83bm2th2wu.fsf@gnu.org>

On 2 Mar 2019, at 13:33, Eli Zaretskii wrote:
>
> I find the text in NEWS much more clear regarding what happens when
> the new arg is non-nil.  I think what is missing is a more explicit
> description you have in NEWS:
>
>  If the new third argument is non-nil, the match is
>  guaranteed to be performed in the order given, as if the strings were
>  made into a regexp by joining them with '\|'.
>
> So I suggest to mention explicitly the "match is guaranteed to be
> performed in the order given" part.

Right; rephrased the doc string and searching.texi.

>> +The optional argument NOREORDER, if nil, allows the returned
>
> We usually say "if nil or omitted" for optional arguments.

Understood; used in both places.

>> +(defun regexp-opt--contains-prefix (strings)
>> +  "Whether a list of strings contains a proper prefix of one of its elements.
>> +STRINGS must be sorted and free from duplicates."
>
> It is usually a good idea to refer to arguments explicitly in the
> first sentence of a doc string.  In this case, I think just up-casing
> STRINGS there should be enough.

Rephrased.

Thanks for the review; revised patch attached.

From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 09:08:33 2019
From: Mattias Engdegård
Date: Sat, 2 Mar 2019 15:08:26 +0100
Subject: Re: bug#34641: rx: (or ...) order unpredictable
To: Eli Zaretskii
Cc: 34641@debbugs.gnu.org
Message-Id: <66BC7C81-523F-42D0-87B8-B27835F8EC4E@acm.org>
In-Reply-To: <0EAAABCF-8AA5-473A-A1E7-4B18040697D8@acm.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="Apple-Mail=_AF779299-983B-4F08-A139-72890CE86067"

--Apple-Mail=_AF779299-983B-4F08-A139-72890CE86067
Content-Type: text/plain; charset=utf-8

On 2 Mar 2019, at 15:05, Mattias Engdegård wrote:
>
> Thanks for the review; revised patch attached.

Sorry, here it is.
--Apple-Mail=_AF779299-983B-4F08-A139-72890CE86067
Content-Disposition: attachment; filename=0001-rx-fix-or-ordering-by-adding-argument-to-regexp-opt.patch
Content-Type: application/octet-stream; x-unix-mode=0644; name="0001-rx-fix-or-ordering-by-adding-argument-to-regexp-opt.patch"

From 78db02e28b451efd6bb9582f5e3e1c38ca47d2f8 Mon Sep 17 00:00:00 2001
From: Mattias Engdegård
Date: Sun, 24 Feb 2019 22:12:52 +0100
Subject: [PATCH] rx: fix `or' ordering by adding argument to regexp-opt

The rx `or' form may reorder its arguments in an unpredictable way,
contrary to user expectation, since it sometimes uses `regexp-opt'.
Add a NOREORDER option to `regexp-opt' for preventing it from
producing a reordered regexp (Bug#34641).

* doc/lispref/searching.texi (Regular Expression Functions):
* etc/NEWS (Lisp Changes in Emacs 27.1):
Describe the new regexp-opt NOREORDER argument.
* lisp/emacs-lisp/regexp-opt.el (regexp-opt): Add NOREORDER.
Make no attempt at regexp improvement if the set of strings contains
a prefix of another string.
(regexp-opt--contains-prefix): New.
* lisp/emacs-lisp/rx.el (rx-or): Call regexp-opt with NOREORDER.
* test/lisp/emacs-lisp/rx-tests.el: Test rx `or' form match order.
---
 doc/lispref/searching.texi       | 13 ++++++++---
 etc/NEWS                         |  7 ++++++
 lisp/emacs-lisp/regexp-opt.el    | 38 ++++++++++++++++++++++++++++----
 lisp/emacs-lisp/rx.el            |  2 +-
 test/lisp/emacs-lisp/rx-tests.el | 13 +++++++++++
 5 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index cfbd2449b1..fb7f48474d 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -950,7 +950,7 @@ whitespace:
 @end defun
 
 @cindex optimize regexp
-@defun regexp-opt strings &optional paren
+@defun regexp-opt strings &optional paren noreorder
 This function returns an efficient regular expression that will match
 any of the strings in the list @var{strings}.  This is useful when you
 need to make matching or searching as fast as possible---for example,
@@ -985,8 +985,15 @@ if it is necessary to ensure that a postfix operator appended to
 it will apply to the whole expression.
 @end table
 
-The resulting regexp of @code{regexp-opt} is equivalent to but usually
-more efficient than that of a simplified version:
+The optional argument @var{noreorder}, if @code{nil} or omitted,
+allows the returned regexp to match the strings in any order.  If
+non-@code{nil}, the match is guaranteed to be performed in the order
+given, as if the strings were made into a regexp by joining them with
+the @samp{\|} operator.
+
+Up to reordering, the resulting regexp of @code{regexp-opt} is
+equivalent to but usually more efficient than that of a simplified
+version:
 
 @example
 (defun simplified-regexp-opt (strings &optional paren)
diff --git a/etc/NEWS b/etc/NEWS
index 29ed7ab481..7c95988ff5 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1642,6 +1642,13 @@ MS-Windows.
 ** New module environment function 'process_input' to process user
 input while module code is running.
 
++++
+** The function 'regexp-opt' accepts an additional optional argument.
+By default, the regexp returned by 'regexp-opt' may match the strings
+in any order.  If the new third argument is non-nil, the match is
+guaranteed to be performed in the order given, as if the strings were
+made into a regexp by joining them with '\|'.
+
 ^L
 * Changes in Emacs 27.1 on Non-Free Operating Systems
 
diff --git a/lisp/emacs-lisp/regexp-opt.el b/lisp/emacs-lisp/regexp-opt.el
index 63786c1508..d0c5f2d3fc 100644
--- a/lisp/emacs-lisp/regexp-opt.el
+++ b/lisp/emacs-lisp/regexp-opt.el
@@ -84,7 +84,7 @@
 ;;; Code:
 
 ;;;###autoload
-(defun regexp-opt (strings &optional paren)
+(defun regexp-opt (strings &optional paren noreorder)
   "Return a regexp to match a string in the list STRINGS.
 Each string should be unique in STRINGS and should not contain
 any regexps, quoted or not.  Optional PAREN specifies how the
@@ -111,8 +111,14 @@ nil
     necessary to ensure that a postfix operator appended to it will
     apply to the whole expression.
 
-The resulting regexp is equivalent to but usually more efficient
-than that of a simplified version:
+The optional argument NOREORDER, if nil or omitted, allows the
+returned regexp to match the strings in any order.  If non-nil,
+the match is guaranteed to be performed in the order given, as if
+the strings were made into a regexp by joining them with the
+`\\|' operator.
+
+Up to reordering, the resulting regexp is equivalent to but
+usually more efficient than that of a simplified version:
 
  (defun simplified-regexp-opt (strings &optional paren)
    (let ((parens
@@ -133,7 +139,15 @@ than that of a simplified version:
            (open (cond ((stringp paren) paren) (paren "\\(")))
            (sorted-strings (delete-dups
                             (sort (copy-sequence strings) 'string-lessp)))
-           (re (regexp-opt-group sorted-strings (or open t) (not open))))
+           (re
+            ;; If NOREORDER is non-nil and the list contains a prefix
+            ;; of another string, we give up all attempts at optimisation.
+            ;; There is plenty of room for improvement (Bug#34641).
+            (if (and noreorder (regexp-opt--contains-prefix sorted-strings))
+                (concat (or open "\\(?:")
+                        (mapconcat #'regexp-quote strings "\\|")
+                        "\\)")
+              (regexp-opt-group sorted-strings (or open t) (not open)))))
       (cond ((eq paren 'words)
              (concat "\\<" re "\\>"))
             ((eq paren 'symbols)
@@ -313,6 +327,22 @@ CHARS should be a list of characters."
           (concat "[" dash caret "]"))
       (concat "[" bracket charset caret dash "]"))))
 
+
+(defun regexp-opt--contains-prefix (strings)
+  "Whether STRINGS contains a proper prefix of one of its other elements.
+STRINGS must be a list of sorted strings without duplicates."
+  (let ((s strings))
+    ;; In a lexicographically sorted list, a string always immediately
+    ;; succeeds one of its prefixes.
+    (while (and (cdr s)
+                (not (string-equal
+                      (car s)
+                      (substring (cadr s) 0 (min (length (car s))
+                                                 (length (cadr s)))))))
+      (setq s (cdr s)))
+    (cdr s)))
+
+
 (provide 'regexp-opt)
 
 ;;; regexp-opt.el ends here
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 715cd608c4..ca756efb49 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -393,7 +393,7 @@ FORM is of the form `(and FORM1 ...)'."
   (rx-group-if
    (if (memq nil (mapcar 'stringp (cdr form)))
        (mapconcat (lambda (x) (rx-form x '|)) (cdr form) "\\|")
-     (regexp-opt (cdr form)))
+     (regexp-opt (cdr form) nil t))
    (and (memq 
rx-parent=20'(:=20*=20t))=20rx-parent)))=0A=20=0A=20=0Adiff=20--git=20= a/test/lisp/emacs-lisp/rx-tests.el=20b/test/lisp/emacs-lisp/rx-tests.el=0A= index=20e14feda347..fa3d9b0d5e=20100644=0A---=20= a/test/lisp/emacs-lisp/rx-tests.el=0A+++=20= b/test/lisp/emacs-lisp/rx-tests.el=0A@@=20-92,5=20+92,18=20@@=0A=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20(*?=20= "e")=20(+?=20"f")=20(\??=20"g")=20(??=20"h"))))=0A=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20"a*b+c?d?e*?f+?g??h??")))=0A=20=0A= +(ert-deftest=20rx-or=20()=0A+=20=20;;=20Test=20or-pattern=20reordering=20= (Bug#34641).=0A+=20=20(let=20((s=20"abc"))=0A+=20=20=20=20(should=20= (equal=20(and=20(string-match=20(rx=20(or=20"abc"=20"ab"=20"a"))=20s)=0A= +=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= (match-string=200=20s))=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20"abc"))=0A+=20=20=20=20(should=20(equal=20(and=20(string-match=20= (rx=20(or=20"ab"=20"abc"=20"a"))=20s)=0A+=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20(match-string=200=20s))=0A+=20=20=20= =20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20"ab"))=0A+=20=20=20=20= (should=20(equal=20(and=20(string-match=20(rx=20(or=20"a"=20"ab"=20= "abc"))=20s)=0A+=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20(match-string=200=20s))=0A+=20=20=20=20=20=20=20=20=20=20=20= =20=20=20=20=20=20=20=20"a"))))=0A+=0A=20(provide=20'rx-tests)=0A=20;;=20= rx-tests.el=20ends=20here.=0A--=20=0A2.17.2=20(Apple=20Git-113)=0A=0A= --Apple-Mail=_AF779299-983B-4F08-A139-72890CE86067-- From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 09:23:27 2019 Received: (at 34641) by debbugs.gnu.org; 2 Mar 2019 14:23:27 +0000 Received: from localhost ([127.0.0.1]:57145 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h05Xy-0000cu-Py for submit@debbugs.gnu.org; Sat, 02 Mar 2019 09:23:27 -0500 Received: from 
eggs.gnu.org ([209.51.188.92]:40673) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h05Xx-0000ch-2Q for 34641@debbugs.gnu.org; Sat, 02 Mar 2019 09:23:25 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:41674) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h05Xr-0000os-Sm; Sat, 02 Mar 2019 09:23:19 -0500 Received: from [176.228.60.248] (port=3639 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1h05Xr-0006GH-Gc; Sat, 02 Mar 2019 09:23:19 -0500 Date: Sat, 02 Mar 2019 16:23:08 +0200 Message-Id: <8336o5gxtf.fsf@gnu.org> From: Eli Zaretskii To: Mattias =?utf-8?Q?Engdeg=C3=A5rd?= In-reply-to: <66BC7C81-523F-42D0-87B8-B27835F8EC4E@acm.org> (message from Mattias =?utf-8?Q?Engdeg=C3=A5rd?= on Sat, 2 Mar 2019 15:08:26 +0100) Subject: Re: bug#34641: rx: (or ...) order unpredictable References: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org> <83bm31ngzp.fsf@gnu.org> <065957BB-1332-458B-8757-742A81CED4A5@acm.org> <83bm2th2wu.fsf@gnu.org> <0EAAABCF-8AA5-473A-A1E7-4B18040697D8@acm.org> <66BC7C81-523F-42D0-87B8-B27835F8EC4E@acm.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 34641 Cc: 34641@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > From: Mattias Engdegård > Date: Sat, 2 Mar 2019 15:08:26 +0100 > Cc: 34641@debbugs.gnu.org > > > Thanks for the review; revised patch attached. > > Sorry, here it is. LGTM, thanks. 
From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 09:37:33 2019 Received: (at 34641-done) by debbugs.gnu.org; 2 Mar 2019 14:37:33 +0000 Received: from localhost ([127.0.0.1]:57150 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h05ld-0002xt-3l for submit@debbugs.gnu.org; Sat, 02 Mar 2019 09:37:33 -0500 Received: from mail179c50.megamailservers.eu ([91.136.10.189]:46120 helo=mail18c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h05la-0002xj-El for 34641-done@debbugs.gnu.org; Sat, 02 Mar 2019 09:37:31 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1551537448; bh=x851CEvFPkzAMcNs0sIOkD2N5dN8YEgsRqLyepyY/MU=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=L7NeD2n0ZxK7boO3SoZaHaNw+Af9ZbOMWgcKxCekKnfqDIhaV+UWv1OUAzy0ivwWM FhzZAq4P5bqVTefVNdJhJvizXQAtcgDG1QoQPQLby8lhWmavVVXh7U1C7VUAmz6PLj VGzkKcfQn5yBh5VI6802hSxkqnRBP56Nt72rNmd8= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c83-251-8-17.bredband.comhem.se [83.251.8.17]) (authenticated bits=0) by mail18c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id x22EbQUC006341; Sat, 2 Mar 2019 14:37:28 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: bug#34641: rx: (or ...) 
order unpredictable From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: <8336o5gxtf.fsf@gnu.org> Date: Sat, 2 Mar 2019 15:37:26 +0100 Content-Transfer-Encoding: 7bit Message-Id: References: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org> <83bm31ngzp.fsf@gnu.org> <065957BB-1332-458B-8757-742A81CED4A5@acm.org> <83bm2th2wu.fsf@gnu.org> <0EAAABCF-8AA5-473A-A1E7-4B18040697D8@acm.org> <66BC7C81-523F-42D0-87B8-B27835F8EC4E@acm.org> <8336o5gxtf.fsf@gnu.org> To: Eli Zaretskii X-Mailer: Apple Mail (2.3445.102.3) X-CTCH-RefID: str=0001.0A0B0211.5C7A9528.0027, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=KOR08mNo c=1 sm=1 tr=0 a=NAHmi3I8mP0S/Y8gRKeQyA==:117 a=NAHmi3I8mP0S/Y8gRKeQyA==:17 a=kj9zAlcOel0A:10 a=mDV3o1hIAAAA:8 a=21forQFsrSIZQMccimYA:9 a=CjuIK1q_8ugA:10 a=ncZ9vwaUYPMA:10 a=_FVE-zBwftR9WsbkzFJk:22 X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 34641-done Cc: 34641-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On 2 March 2019 at 15:23, Eli Zaretskii wrote: > > LGTM, thanks. Thank you, pushed.
From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 02 18:48:39 2019 Received: (at 34641) by debbugs.gnu.org; 2 Mar 2019 23:48:39 +0000 Received: from localhost ([127.0.0.1]:57955 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h0EMx-0007co-12 for submit@debbugs.gnu.org; Sat, 02 Mar 2019 18:48:39 -0500 Received: from smtp-4.orcon.net.nz ([60.234.4.59]:52708) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h0EMv-0007ce-18 for 34641@debbugs.gnu.org; Sat, 02 Mar 2019 18:48:37 -0500 Received: from [150.107.172.103] (port=44669 helo=[192.168.20.103]) by smtp-4.orcon.net.nz with esmtpa (Exim 4.86_2) (envelope-from ) id 1h0EMr-0002Zf-U6; Sun, 03 Mar 2019 12:48:34 +1300 Subject: Re: bug#34641: rx: (or ...) order unpredictable To: Eli Zaretskii , =?UTF-8?Q?Mattias_Engdeg=c3=a5rd?= References: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org> <83bm31ngzp.fsf@gnu.org> <065957BB-1332-458B-8757-742A81CED4A5@acm.org> <83bm2th2wu.fsf@gnu.org> From: Phil Sainty Message-ID: Date: Sun, 3 Mar 2019 12:48:33 +1300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <83bm2th2wu.fsf@gnu.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit X-GeoIP: NZ X-Spam_score: -2.9 X-Spam_score_int: -28 X-Spam_bar: -- X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 34641 Cc: 34641@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) >> +The optional argument NOREORDER, if nil, allows the returned Could we change that name to have a positive sense? Boolean arguments with a negative sense/meaning are invariably more awkward to read than ones with a positive meaning, IMO. 
NOREORDER set to nil means "No NOREORDER" (aka "Reorder"). Those double-negatives should be avoided whenever it's simple to do so, as they make things harder for anyone reading the documentation. We could use KEEP-ORDER or RETAIN-ORDER or SAME-ORDER or anything along those lines, and then a 'true' value matches the positive sense of the name, which is much nicer. -Phil From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 03 03:54:20 2019 Received: (at 34641) by debbugs.gnu.org; 3 Mar 2019 08:54:20 +0000 Received: from localhost ([127.0.0.1]:58078 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h0Mt1-00060G-Lt for submit@debbugs.gnu.org; Sun, 03 Mar 2019 03:54:19 -0500 Received: from mail176c50.megamailservers.eu ([91.136.10.186]:57740 helo=mail37c50.megamailservers.eu) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h0Msz-000608-Ff for 34641@debbugs.gnu.org; Sun, 03 Mar 2019 03:54:18 -0500 X-Authenticated-User: mattiase@bredband.net DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=megamailservers.eu; s=maildub; t=1551603255; bh=xHKG+UCD7PD/SODz7VgP8EhMORJMA61bVrIteCcVuvg=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=o3LqTPJQFAOpGhcWJEcpiqZZ3tLoIq99YrSEs+N1qpDqLa1eXxzFj0j+IqNWG0qnt j8UPsyvr2DdhVISdo3NpxeMxo/cNSzUg9tDUQvW03mptUH9agff1G75H7/P2vdssEM fWE1C6I2tPyE/3atMJlAJ5vu0P1hSFdau9DooUsU= Feedback-ID: mattiase@acm.or Received: from [192.168.0.4] (c83-251-8-17.bredband.comhem.se [83.251.8.17]) (authenticated bits=0) by mail37c50.megamailservers.eu (8.14.9/8.13.1) with ESMTP id x238sDjW006899; Sun, 3 Mar 2019 08:54:15 +0000 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 12.2 \(3445.102.3\)) Subject: Re: bug#34641: rx: (or ...)
order unpredictable From: =?utf-8?Q?Mattias_Engdeg=C3=A5rd?= In-Reply-To: Date: Sun, 3 Mar 2019 09:54:13 +0100 Content-Transfer-Encoding: 7bit Message-Id: <0C2599C8-342D-422B-839A-1BABE10E7EB6@acm.org> References: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org> <83bm31ngzp.fsf@gnu.org> <065957BB-1332-458B-8757-742A81CED4A5@acm.org> <83bm2th2wu.fsf@gnu.org> To: Phil Sainty X-Mailer: Apple Mail (2.3445.102.3) X-CTCH-RefID: str=0001.0A0B0215.5C7B9637.003B, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 X-CTCH-VOD: Unknown X-CTCH-Spam: Unknown X-CTCH-Score: 0.000 X-CTCH-Rules: X-CTCH-Flags: 0 X-CTCH-ScoreCust: 0.000 X-CSC: 0 X-CHA: v=2.3 cv=J+uEEjvS c=1 sm=1 tr=0 a=NAHmi3I8mP0S/Y8gRKeQyA==:117 a=NAHmi3I8mP0S/Y8gRKeQyA==:17 a=kj9zAlcOel0A:10 a=F1RueMgOAAAA:8 a=IImmbOB2s3G-P8x6oHQA:9 a=CjuIK1q_8ugA:10 a=7ps5cwaF9Li-lmUWtDeZ:22 X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 34641 Cc: Eli Zaretskii , 34641@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On 3 March 2019 at 00:48, Phil Sainty wrote: > >>> +The optional argument NOREORDER, if nil, allows the returned > > Could we change that name to have a positive sense? > > Boolean arguments with a negative sense/meaning are invariably > more awkward to read than ones with a positive meaning, IMO. > > NOREORDER set to nil means "No NOREORDER" (aka "Reorder"). Those > double-negatives should be avoided whenever it's simple to do so, > as they make things harder for anyone reading the documentation. > > We could use KEEP-ORDER or RETAIN-ORDER or SAME-ORDER or anything > along those lines, and then a 'true' value matches the positive > sense of the name, which is much nicer. You are right, and I wasn't happy with the negative name either. Would KEEP-ORDER do?
From debbugs-submit-bounces@debbugs.gnu.org Thu Mar 07 04:00:34 2019 Received: (at 34641) by debbugs.gnu.org; 7 Mar 2019 09:00:34 +0000 Received: from localhost ([127.0.0.1]:34910 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h1otF-0006dy-RF for submit@debbugs.gnu.org; Thu, 07 Mar 2019 04:00:34 -0500 Received: from smtp-3.orcon.net.nz ([60.234.4.44]:33272) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1h1otD-0006do-M9 for 34641@debbugs.gnu.org; Thu, 07 Mar 2019 04:00:32 -0500 Received: from [150.107.172.103] (port=4491 helo=[192.168.20.103]) by smtp-3.orcon.net.nz with esmtpa (Exim 4.86_2) (envelope-from ) id 1h1otA-00056m-Cc; Thu, 07 Mar 2019 22:00:28 +1300 Subject: Re: bug#34641: rx: (or ...) order unpredictable To: =?UTF-8?Q?Mattias_Engdeg=c3=a5rd?= References: <836B8DC2-9358-40AC-83AF-7C4D960D9A53@acm.org> <83bm31ngzp.fsf@gnu.org> <065957BB-1332-458B-8757-742A81CED4A5@acm.org> <83bm2th2wu.fsf@gnu.org> <0C2599C8-342D-422B-839A-1BABE10E7EB6@acm.org> From: Phil Sainty Message-ID: <0fabd680-4b55-2199-29bd-ab8129723bef@orcon.net.nz> Date: Thu, 7 Mar 2019 22:00:27 +1300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <0C2599C8-342D-422B-839A-1BABE10E7EB6@acm.org> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 8bit X-GeoIP: NZ X-Spam_score: -2.9 X-Spam_score_int: -28 X-Spam_bar: -- X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 34641 Cc: 34641@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.7 (-) On 3/03/19 9:54 PM, Mattias Engdegård wrote: > On 3 March 2019 at
00:48, Phil Sainty wrote: >>>> +The optional argument NOREORDER, if nil, allows the returned >> Could we change that name to have a positive sense? > > You are right, and I wasn't happy with the negative name either. > Would KEEP-ORDER do? I think that's good, and no one has objected, so I think you could go ahead with that change. cheers, -Phil From unknown Fri Jun 13 11:00:02 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Thu, 04 Apr 2019 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator