From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 09 17:30:03 2016
From: Drew Adams
To: bug-gnu-emacs@gnu.org
Subject: 24.5; isearch-regexp: wrong error message
Date: Wed, 9 Nov 2016 14:29:45 -0800 (PST)
Message-ID: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default>

Do this in a large file:

1. C-M-s \(.\|^J\)+

That shows the error message: [error Stack overflow in regexp matcher].
OK, understandable.

2. C-M-s \(.\|^J\)\{,4000\}, where ^J is really a newline char, typed
   by using `C-q C-j'.

No problem with that search.

3. C-M-s \(.\|^J\)\{,40000\}

That shows the error message: [incomplete input], which is wrong, IMO.

In GNU Emacs 24.5.1 (i686-pc-mingw32)
 of 2015-04-11 on LEG570
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/c/usr --host=i686-pc-mingw32'

From debbugs-submit-bounces@debbugs.gnu.org Sun Mar 26 00:24:03 2017
From: npostavs@users.sourceforge.net
To: control@debbugs.gnu.org
Subject: control message for bug #24914
Date: Sun, 26 Mar 2017 00:25:19 -0400
Message-ID: <87r31k4qc0.fsf@users.sourceforge.net>

tags 24914 confirmed
found 24914 25.2
quit

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 03 11:38:01 2017
From: Noam Postavsky
To: Drew Adams
Cc: 24914@debbugs.gnu.org
Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sun, 03 Dec 2017 11:37:49 -0500
Message-ID: <87h8t7ix7m.fsf@users.sourceforge.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain

Drew Adams writes:

> 3. C-M-s \(.\|^J\)\{,40000\}
>
> That shows the error message: [incomplete input], which is wrong, IMO.

The reason it doesn't work is because the number of repetitions is
limited to 32767 (#x7fff). Obviously that should be documented in the
manual.

As to the error message itself, there isn't really a way to distinguish
between incomplete and invalid input, so the only thing I can see to do
is to change the message to [incomplete or invalid input].

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline;
 filename=0001-Document-limitation-of-regexp-repetition-Bug-24914.patch
Content-Description: patch

>From f7bb728281408170cfe79005b03d2b382a84cdbd Mon Sep 17 00:00:00 2001
From: Noam Postavsky
Date: Sat, 2 Dec 2017 19:01:54 -0500
Subject: [PATCH] Document limitation of regexp repetition (Bug#24914)

* doc/lispref/searching.texi (Regexp Backslash): Explain that \{m,n\}
may only use numbers up to 32767.
* lisp/isearch.el (isearch-search): Update error message to include
invalid input possibility.
---
 doc/lispref/searching.texi | 3 ++-
 lisp/isearch.el            | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 755fa554bb..92b7e6d17e 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -639,7 +639,8 @@ Regexp Backslash
 is a more general postfix operator that specifies repetition with a
 minimum of @var{m} repeats and a maximum of @var{n} repeats.  If @var{m}
 is omitted, the minimum is 0; if @var{n} is omitted, there is no
-maximum.
+maximum.  For both forms, @var{m} and @var{n}, if specified, may be no
+larger than 32767.
 
 For example, @samp{c[ad]\@{1,2\@}r} matches the strings @samp{car},
 @samp{cdr}, @samp{caar}, @samp{cadr}, @samp{cdar}, and @samp{cddr}, and
diff --git a/lisp/isearch.el b/lisp/isearch.el
index 13fa97ea71..dfc5f9f3f7 100644
--- a/lisp/isearch.el
+++ b/lisp/isearch.el
@@ -2853,7 +2853,7 @@ isearch-search
       ((string-match
         "\\`Premature \\|\\`Unmatched \\|\\`Invalid "
         isearch-error)
-       (setq isearch-error "incomplete input"))
+       (setq isearch-error "incomplete or invalid input"))
       ((and (not isearch-regexp)
             (string-match "\\`Regular expression too big" isearch-error))
        (cond
-- 
2.11.0

--=-=-=--

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 03 13:00:27 2017
From: Drew Adams
To: Noam Postavsky
Cc: 24914@debbugs.gnu.org
Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sun, 3 Dec 2017 10:00:14 -0800 (PST)

> > 3. C-M-s \(.\|^J\)\{,40000\}
> > That shows the error message: [incomplete input],
> > which is wrong, IMO.
>
> The reason it doesn't work is because the number of repetitions
> is limited to 32767 (#x7fff).

Yet another case for adding bignums to Emacs Lisp?
I imagine someone will answer that there needs to be
a limit.

In that case, can we not use something larger?
Could we use the value of `most-positive-fixnum'?

> Obviously that should be documented in the manual.

Yes, please.

> As to the error message itself, there isn't really a way
> to distinguish between incomplete and invalid input,

We do that in some places in the code.

Some code parses the regexp, and that code must know (or be able to
know) both that the regexp is not incomplete and that the numeral
given for the number of repetitions is too large.

> so the only thing I can see to do
> is to change the message to [incomplete or invalid input].

I suppose that's better than [incomplete], but it doesn't
really help users very much. Can we please do that but keep
this bug open, hoping that someone will someday provide a
real fix?

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 03 13:14:04 2017
From: Noam Postavsky
To: Drew Adams
Cc: 24914@debbugs.gnu.org
Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sun, 03 Dec 2017 13:13:54 -0500
Message-ID: <87d13visrh.fsf@users.sourceforge.net>

Drew Adams writes:

>> > 3. C-M-s \(.\|^J\)\{,40000\}
>> > That shows the error message: [incomplete input],
>> > which is wrong, IMO.
>>
>> The reason it doesn't work is because the number of repetitions
>> is limited to 32767 (#x7fff).
>
> Yet another case for adding bignums to Emacs Lisp?
> I imagine someone will answer that there needs to be
> a limit.
>
> In that case, can we not use something larger?
> Could we use the value of `most-positive-fixnum'?

It's not a limit in Lisp, but in regex.c.

>> As to the error message itself, there isn't really a way
>> to distinguish between incomplete and invalid input,
>
> We do that in some places in the code.

What places are those?

> Some code parses the regexp, and that code must know (or be able to
> know) both that the regexp is not incomplete

What does it mean for a regexp to be incomplete or not? As far as I can
tell, the only distinction is that the user means to type more; but the
code doesn't know what will happen in the future...

> and that the numeral
> given for the number of repetitions is too large.

I suppose we could change regex.c to give a different error message for
a repetition number that is too high, and then isearch.el could check
for that specially.

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 03 13:56:44 2017
From: Drew Adams
To: Noam Postavsky
Cc: 24914@debbugs.gnu.org
Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message
Date: Sun, 3 Dec 2017 10:56:32 -0800 (PST)

> >> > 3. C-M-s \(.\|^J\)\{,40000\}
> >> > That shows the error message: [incomplete input],
> >> > which is wrong, IMO.
> >>
> >> The reason it doesn't work is because the number of repetitions
> >> is limited to 32767 (#x7fff).
> >
> > Yet another case for adding bignums to Emacs Lisp?
> > I imagine someone will answer that there needs to be
> > a limit.
> >
> > In that case, can we not use something larger?
> > Could we use the value of `most-positive-fixnum'?
>
> It's not a limit in Lisp, but in regex.c.

We can't use something larger there?

> >> As to the error message itself, there isn't really a way
> >> to distinguish between incomplete and invalid input,
> >
> > We do that in some places in the code.
>
> What places are those?

In the Lisp code, at least, there are a few places where
we provide an error that is specific to an invalid regexp.
Search for handling of standard error `invalid-regexp',
for instance.

But if this is handled only in C code then you might
want to look there instead.

> > Some code parses the regexp, and that code must know (or be able to
> > know) both that the regexp is not incomplete
>
> What does it mean for a regexp to be incomplete or not? As far as I can
> tell, the only distinction is that the user means to type more; but the
> code doesn't know what will happen in the future...

Presumably that term is used only for cases where we can
be sure that in order for the regexp to be valid there
would need to be further input. `foo' is not incomplete,
whether or not the user "means to type more". `[^' is
incomplete, because it can be made valid only by typing
more.

> > and that the numeral
> > given for the number of repetitions is too large.
>
> I suppose we could change regex.c to give a different error message for
> a repetition number that is too high, and then isearch.el could check
> for that specially.

That would be great.
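
The suggestion above (have the regexp compiler report an over-large repetition count separately, so the caller can check for it) might look roughly like the following. This is only a minimal sketch under stated assumptions: `read_bound` and `enum bound_status` are hypothetical names invented for illustration and are not taken from regex.c; only the limit 0x7fff comes from the thread.

```c
#include <ctype.h>

#define RE_DUP_MAX 0x7fff /* 32767, the limit mentioned in the thread */

/* Hypothetical status codes, for illustration only; this is not the
   error enum that regex.c actually uses.  */
enum bound_status {
  BOUND_OK,      /* a number within range was read */
  BOUND_MISSING, /* no digits at this position (bound omitted) */
  BOUND_TOO_BIG  /* digits were read, but the value exceeds RE_DUP_MAX */
};

/* Read a decimal repetition bound starting at *p and advance *p past
   the digits.  The value is stored in *out only for BOUND_OK.  */
static enum bound_status
read_bound (const char **p, int *out)
{
  const char *s = *p;
  long value = 0;
  int too_big = 0;

  if (!isdigit ((unsigned char) *s))
    return BOUND_MISSING;

  while (isdigit ((unsigned char) *s))
    {
      if (!too_big)
        {
          value = value * 10 + (*s - '0');
          if (value > RE_DUP_MAX)
            too_big = 1; /* stop accumulating so value cannot overflow */
        }
      s++;
    }

  *p = s;
  if (too_big)
    return BOUND_TOO_BIG;
  *out = (int) value;
  return BOUND_OK;
}
```

With a distinct status for the too-large case, the caller could produce a message that names the actual problem instead of folding it into a generic "invalid" or "incomplete" report.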
From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 01:27:38 2017 Received: (at 24914) by debbugs.gnu.org; 4 Dec 2017 06:27:38 +0000 Received: from localhost ([127.0.0.1]:44198 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eLkE6-0003L0-9D for submit@debbugs.gnu.org; Mon, 04 Dec 2017 01:27:38 -0500 Received: from mail-it0-f47.google.com ([209.85.214.47]:36163) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eLkE4-0003J7-PF for 24914@debbugs.gnu.org; Mon, 04 Dec 2017 01:27:37 -0500 Received: by mail-it0-f47.google.com with SMTP id d16so4011509itj.1 for <24914@debbugs.gnu.org>; Sun, 03 Dec 2017 22:27:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=59HXhpyk+KoLBuuvbpXupfm6sIQkkQ0MjbX00KQVzh4=; b=BdhUBGlXDcW+V/x0Se+9moGZ73DISsTj1kKxbaNOYU/ZZaRRMTG2UV3YYVsyuCRrm6 lj+3mIixleqfVGz/Q7frbJBqOQBzACDv4JY+A8/KoA+Rvjj/qFr+mjA+z8BmnDs1LTKD qg5oA8YG6qn6kr+CaCbJwOk/fC96MJQ2u+IV42fGPzPkir4GIi9y3Gn+0ecuxgsHmn5Z GXdXu5YZgmZ77fUuSiOPaEz3Oi4cv3VAWeCukWDtgjoFyWouwuN8lV2ZC2JtYDqkPKEZ M1Av3ZFQiU4hmZVZkDTEJTV6PFDYtX0WBTo0wZg+OMStzVNVfsqnkRY+S5k1WHQyXZjs pu9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=59HXhpyk+KoLBuuvbpXupfm6sIQkkQ0MjbX00KQVzh4=; b=b2R64pjHEwbtMaNEsAwWsgVqGDD9eG8bCW4OB8P3N9pYkYsWidceOkPdQLHdnEpRDi AP2JCHGerErFfQVljw1KnUGAmaI5KRBrUDELh6A35zbGIuLRpUjk0OGeNKwzzLfeIyyf lX7xwLvUiu2i+JsOx3dpPuP+PioQhdJiDZ+hePfOLRLfKuI21QXH2LAnYNRAxIF73Mwm ZmzoA4rt/qyojvBNKeEbPLvf4zsTtaQOGElQUILEHPYPFNKgBQ7guRayxInRRTtxV5lI G6xbvTp+6EmSG2n5RUxRLzwAGTrp56WNOf5DvNbwE72Pjhsp1EvbUX1XlGFoBf6lMmYe PYjw== X-Gm-Message-State: AKGB3mKFTIAFA5GYdGE9iRmtyxp92jh97F3H2GIYAnKnTDWd5F0mOVXX HIAg9j0PsL+IChEesWjG8/y4tQ== X-Google-Smtp-Source: 
AGs4zMa7Hf5CFQDq6qxBCCJIDpE51SDDB55XBb8saWIdWMMiTQbA9N8hfB0WbBuoZuHNAbWY4ve2jw== X-Received: by 10.36.22.147 with SMTP id a141mr12039089ita.30.1512368850770; Sun, 03 Dec 2017 22:27:30 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id r190sm5584751iod.7.2017.12.03.22.27.28 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 03 Dec 2017 22:27:29 -0800 (PST) From: Noam Postavsky To: Drew Adams Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> Date: Mon, 04 Dec 2017 01:27:27 -0500 In-Reply-To: (Drew Adams's message of "Sun, 3 Dec 2017 10:56:32 -0800 (PST)") Message-ID: <87shcrgg8g.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.1 (/) X-Debbugs-Envelope-To: 24914 Cc: 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.1 (/) Drew Adams writes: >> It's not a limit in Lisp, but in regex.c. > > We can't use something larger there? Hmm, right, actually I see in regex.h: /* If sizeof(int) == 2, then ((1 << 15) - 1) overflows. */ #define RE_DUP_MAX (0x7fff) Does Emacs even support 16 bit platforms? >> >> As to the error message itself, there isn't really a way >> >> to distinguish between incomplete and invalid input, >> > >> > We do that in some places in the code. >> >> What places are those? > > In the Lisp code, at least, there are a few places where > we provide an error that is specific to an invalid regexp. > Search for handling of standard error `invalid-regexp', > for instance. 
As far as I can tell, none of those places (apart from isearch.el, the subject of this bug) try to flag "incomplete" regexps, only invalid or valid. >> > Some code parses the regexp, and that code must know (or be able to >> > know) both that the regexp is not incomplete >> >> What does it mean for a regexp to be incomplete or not? As far as I can >> tell, the only distinction is that the user means to type more; but the >> code doesn't know what will happen in the future... > > Presumably that term is used only for cases where we can > be sure that in order for the regexp to be valid there > would need to be further input. `foo' is not incomplete, > whether or not the user "means to type more". `[^' is > incomplete, because it can be made valid only by typing > more. Is `\\{100,20\\}' incomplete? Because it could be made valid by the user adding a 0 after the 20 to become '\\{100,200\\}'. Actually, I'm wondering what's the point of isearch showing "incomplete" instead of the actual regexp invalid error. 
I.e., why not instead of

    \ [incomplete]
    \{ [incomplete]
    \{4 [incomplete]
    \{4000 [incomplete]
    \{4000\ [incomplete]
    \{4000\}

show this:

    \ [Trailing backslash]
    \{ [Unmatched \{]
    \{4 [Unmatched \{]
    \{4000 [Unmatched \{]
    \{4000\ [Trailing backslash]
    \{4000\}

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 09:52:38 2017
From: Drew Adams
To: Noam Postavsky
Cc: 24914@debbugs.gnu.org
Date: Mon, 4 Dec 2017 06:52:27 -0800 (PST)
Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message

> >> >> As to the error message itself, there isn't really a way
> >> >> to distinguish between incomplete and invalid input,
> >> >
> >> > We do that in some places in the code.
> >>
> >> What places are those?
> >
> > In the Lisp code, at least, there are a few places where
> > we provide an error that is specific to an invalid regexp.
> > Search for handling of standard error `invalid-regexp',
> > for instance.
>
> As far as I can tell, none of those places (apart from isearch.el, the
> subject of this bug) try to flag "incomplete" regexps, only invalid or
> valid.

Isn't that the point?  In the case in question the regexp
is not incomplete.  It is "invalid" because the occurrences
count is too high.  Showing a message that says it is
incomplete is wrong - that was the point of this report.

What I cited are cases where we do flag _particular kinds_
of invalid regexps, and so tailor the error msg.  That's
what could be hoped for in the current case too: ideally,
show a msg that says that the occurrences count is too high.
If that can't be detected exactly then perhaps we can get
close - e.g., invalid occurrences count or some such.

> >> > Some code parses the regexp, and that code must know
> >> > (or be able to know) both that the regexp is not incomplete
> >>
> >> What does it mean for a regexp to be incomplete or not?
> >> As far as I can tell, the only distinction is that the
> >> user means to type more; but the code doesn't know what
> >> will happen in the future...
> >
> > Presumably that term is used only for cases where we can
> > be sure that in order for the regexp to be valid there
> > would need to be further input.  `foo' is not incomplete,
> > whether or not the user "means to type more".  `[^' is
> > incomplete, because it can be made valid only by typing
> > more.
>
> Is `\\{100,20\\}' incomplete?  Because it could be made valid
> by the user adding a 0 after the 20 to become '\\{100,200\\}'.

Of course, a user could always use `M-e' to edit the search
pattern and type 0 before the \\}.  But our isearch messages
don't take that kind of thing into account.  They assume the
cursor is at the _end_ of the search pattern, so that further
input is appended to the pattern.

An incomplete-regexp message means (so far, aside from bugs
like this one or perhaps cases where Emacs cannot do better)
that we expect you to keep typing - at the end of the search
pattern, to complete a valid regexp.

> Actually, I'm wondering what's the point of isearch showing
> "incomplete" instead of the actual regexp invalid error.
> I.e., why not instead of
>
>     \ [incomplete]
>     \{ [incomplete]
>     \{4 [incomplete]
>     \{4000 [incomplete]
>     \{4000\ [incomplete]
>     \{4000\}
>
> show this:
>
>     \ [Trailing backslash]
>     \{ [Unmatched \{]
>     \{4 [Unmatched \{]
>     \{4000 [Unmatched \{]
>     \{4000\ [Trailing backslash]
>     \{4000\}

Feel free to work on that.  You might run into some cases
that are not so clear-cut.  But you might well improve
things generally in some way.

The problem with that is (I suppose) that it is not, in
general, straightforward what would be needed to make the
current pattern a valid regexp.  In particular, there might
be multiple ways to make it valid.

Trying to describe what you're expecting, as possible
appended input that would make for a valid regexp, would be
hard.  And doing it accurately, even when feasible, would
lead to complex error msgs.

It's maybe more user-friendly to just indicate that, so far,
the regexp is not valid, but that it could become valid by
appending something (i.e., without trying to accurately
characterize that something).

Anyway, unless working on that is needed or appropriate for
fixing the reported bug, that should perhaps be dealt with
by a separate bug (enhancement request).
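
As a rough illustration of the limit behind this report: the thread
points at `RE_DUP_MAX' in regex.h, defined there as 0x7fff, as the cap
on interval counts.  The sketch below only does that arithmetic.  The
helper name `interval_count_ok' is hypothetical, and the actual check
in regex.c is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Value quoted from regex.h in this thread.  */
#define RE_DUP_MAX (0x7fff)

/* Hypothetical helper (not Emacs code): return true if COUNT, the
   number written inside \{...\}, fits within RE_DUP_MAX.  */
static bool
interval_count_ok (long count)
{
  assert (count >= 0);
  return count <= RE_DUP_MAX;
}
```

Under that assumption, interval_count_ok (4000) holds and
interval_count_ok (40000) does not, which lines up with steps 2 and 3
of the original recipe: the second pattern is over the limit, not
unfinished.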
From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 10:18:30 2017
From: Eli Zaretskii
To: Noam Postavsky
Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org
Date: Mon, 04 Dec 2017 17:18:08 +0200
Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message

> From: Noam Postavsky
> Date: Mon, 04 Dec 2017 01:27:27 -0500
> Cc: 24914@debbugs.gnu.org
>
> Drew Adams writes:
>
> >> It's not a limit in Lisp, but in regex.c.
> >
> > We can't use something larger there?
>
> Hmm, right, actually I see in regex.h:
>
>     /* If sizeof(int) == 2, then ((1 << 15) - 1) overflows.  */
>     #define RE_DUP_MAX (0x7fff)
>
> Does Emacs even support 16 bit platforms?

Emacs never did (the MS-DOS port of Emacs runs in i386 32-bit
protected mode on top of a 16-bit OS).  But regex.c did, at some very
distant past, to support the 16-bit MS compiler, or at least it tried
to.

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 20:18:13 2017
From: Noam Postavsky
To: Drew Adams
Cc: 24914@debbugs.gnu.org
Date: Mon, 04 Dec 2017 20:18:02 -0500
Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message

Drew Adams writes:

> What I cited are cases where we do flag _particular kinds_
> of invalid regexps, and so tailor the error msg.

I'm not sure if you're citing actual code we have right now, or just
some hypotheticals.  In isearch.el, we pretty much do the opposite of
tailor the error message.

>> Actually, I'm wondering what's the point of isearch showing
>> "incomplete" instead of the actual regexp invalid error.
>> I.e., why not instead of
>>
>>     \ [incomplete]
>>     \{ [incomplete]
>>     \{4 [incomplete]
>>     \{4000 [incomplete]
>>     \{4000\ [incomplete]
>>     \{4000\}
>>
>> show this:
>>
>>     \ [Trailing backslash]
>>     \{ [Unmatched \{]
>>     \{4 [Unmatched \{]
>>     \{4000 [Unmatched \{]
>>     \{4000\ [Trailing backslash]
>>     \{4000\}
>
> Feel free to work on that.  You might run into some cases
> that are not so clear-cut.  But you might well improve
> things generally in some way.

I meant just the following patch, you can try it out easily:

--- i/lisp/isearch.el
+++ w/lisp/isearch.el
@@ -2850,10 +2850,6 @@ isearch-search
     (invalid-regexp
      (setq isearch-error (car (cdr lossage)))
      (cond
-      ((string-match
-        "\\`Premature \\|\\`Unmatched \\|\\`Invalid "
-        isearch-error)
-       (setq isearch-error "incomplete input"))
       ((and (not isearch-regexp)
             (string-match "\\`Regular expression too big" isearch-error))
        (cond

Eli Zaretskii writes:

>>
>>     /* If sizeof(int) == 2, then ((1 << 15) - 1) overflows.  */
>>     #define RE_DUP_MAX (0x7fff)
>>
>> Does Emacs even support 16 bit platforms?
>
> Emacs never did (the MS-DOS port of Emacs runs in i386 32-bit
> protected mode on top of a 16-bit OS).  But regex.c did, at some very
> distant past, to support the 16-bit MS compiler, or at least it tried
> to.

So changing to 2^31 as the max should be fine, right?
--- i/src/regex.h
+++ w/src/regex.h
@@ -270,8 +270,10 @@
 #ifdef RE_DUP_MAX
 # undef RE_DUP_MAX
 #endif
-/* If sizeof(int) == 2, then ((1 << 15) - 1) overflows.  */
-#define RE_DUP_MAX (0x7fff)
+/* If sizeof(int) == 4, then ((1 << 31) - 1) overflows.  This used to
+   be limited to 0x7fff, but Emacs never supported 16 bit platforms
+   anyway.  */
+#define RE_DUP_MAX (0x7fffffff)

 /* POSIX `cflags' bits (i.e., information for `regcomp').  */

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 22:16:07 2017
From: Drew Adams
To: Noam Postavsky
Cc: 24914@debbugs.gnu.org
Date: Mon, 4 Dec 2017 19:15:42 -0800 (PST)
Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message

> > What I cited are cases where we do flag _particular kinds_
> > of invalid regexps, and so tailor the error msg.
>
> I'm not sure if you're citing actual code we have right now, or just
> some hypotheticals.  In isearch.el, we pretty much do the opposite of
> tailor the error message.

I was citing what I thought were such cases in the current
isearch.el code - cases where we do not just say "Invalid
Regexp".  We say things like this:

  Too many words
  Too many spaces for whitespace matching
  Unmatched [ or [^

Granted, the last is used only in `isearch-query-replace'.

My point was that in some existing cases (not many,
admittedly), we do try to give a more precise error
message when signal `invalid-regexp' is detected.

But I'm not sure what you're arguing, if you are arguing.

Certainly we don't tailor the message _much_ for the
kind of `invalid-regexp'.  But we do make some effort
to do that, even now, AFAICT.

> >> Actually, I'm wondering what's the point of isearch showing
> >> "incomplete" instead of the actual regexp invalid error.
> >> I.e., why not instead of
> >>
> >>     \ [incomplete]
> >>     \{ [incomplete]
> >>     \{4 [incomplete]
> >>     \{4000 [incomplete]
> >>     \{4000\ [incomplete]
> >>     \{4000\}
> >>
> >> show this:
> >>
> >>     \ [Trailing backslash]
> >>     \{ [Unmatched \{]
> >>     \{4 [Unmatched \{]
> >>     \{4000 [Unmatched \{]
> >>     \{4000\ [Trailing backslash]
> >>     \{4000\}
>
> I meant just the following patch, you can try it out easily:
>
>      (invalid-regexp
>       (setq isearch-error (car (cdr lossage)))
>       (cond
> -      ((string-match
> -        "\\`Premature \\|\\`Unmatched \\|\\`Invalid "
> -        isearch-error)
> -       (setq isearch-error "incomplete input"))
>        ((and (not isearch-regexp)
>              (string-match "\\`Regular expression too big" isearch-error))
>         (cond

You mean show "[Invalid content of \{\}]" in all cases?
_Never_ show "[incomplete input]"?  Why would that be better?

Anyway, I don't have a strong opinion about that.  I do
think that in the case reported it's too bad that we say
"[incomplete input]".

But I don't think it follows that it would be more helpful to
most users to show the invalid-regexp description in cases
where Emacs can really tell that the input is necessarily
incomplete.
I suspect that it is quite common for that
"incomplete input" message to be helpful.

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 22:52:02 2017
From: Noam Postavsky
To: Drew Adams
Cc: 24914@debbugs.gnu.org
Date: Mon, 04 Dec 2017 22:51:51 -0500
Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message

Drew Adams writes:

> I was citing what I thought were such cases in the current
> isearch.el code - cases where we do not just say "Invalid
> Regexp".  We say things like this:
>
>   Too many words
>   Too many spaces for whitespace matching

In that case, the regexp is constructed by Emacs on behalf of the user,
so it makes sense to translate the error message so that it matches the
user operation.

>   Unmatched [ or [^

AFAICT, the isearch.el code doesn't write that message, but rather reads
it.

> Granted, the last is used only in `isearch-query-replace'.
>
> My point was that in some existing cases (not many,
> admittedly), we do try to give a more precise error
> message when signal `invalid-regexp' is detected.
>
> But I'm not sure what you're arguing, if you are arguing.

The C regex code produces the following error messages when parsing
regexps:

      {
        gettext_noop ("Success"), /* REG_NOERROR */
        gettext_noop ("No match"), /* REG_NOMATCH */
        gettext_noop ("Invalid regular expression"), /* REG_BADPAT */
        gettext_noop ("Invalid collation character"), /* REG_ECOLLATE */
        gettext_noop ("Invalid character class name"), /* REG_ECTYPE */
        gettext_noop ("Trailing backslash"), /* REG_EESCAPE */
        gettext_noop ("Invalid back reference"), /* REG_ESUBREG */
        gettext_noop ("Unmatched [ or [^"), /* REG_EBRACK */
        gettext_noop ("Unmatched ( or \\("), /* REG_EPAREN */
        gettext_noop ("Unmatched \\{"), /* REG_EBRACE */
        gettext_noop ("Invalid content of \\{\\}"), /* REG_BADBR */
        gettext_noop ("Invalid range end"), /* REG_ERANGE */
        gettext_noop ("Memory exhausted"), /* REG_ESPACE */
        gettext_noop ("Invalid preceding regular expression"), /* REG_BADRPT */
        gettext_noop ("Premature end of regular expression"), /* REG_EEND */
        gettext_noop ("Regular expression too big"), /* REG_ESIZE */
        gettext_noop ("Unmatched ) or \\)"), /* REG_ERPAREN */
        gettext_noop ("Range striding over charsets"), /* REG_ERANGEX */
      };

When doing isearch-*-regexp, most of those errors are converted into
"incomplete" (i.e., *less* precise).  But I think it would be more
helpful to show the original error message.

> But I don't think it follows that it would be more helpful to
> most users to show the invalid-regexp description in cases
> where Emacs can really tell that the input is necessarily
> incomplete.  I suspect that it is quite common for that
> "incomplete input" message to be helpful.

How does it help (compared to the more precise message)?
Seems to me
that telling the user they haven't finished entering the regexp isn't
especially helpful; surely the user already knows they haven't finished
typing yet.

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 04 23:53:06 2017
From: Drew Adams
To: Noam Postavsky
Cc: 24914@debbugs.gnu.org
Date: Mon, 4 Dec 2017 20:52:50 -0800 (PST)
Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message

>     gettext_noop ("Success"), /* REG_NOERROR */
>     gettext_noop ("No match"), /* REG_NOMATCH */
>     gettext_noop ("Invalid regular expression"), /* REG_BADPAT */
>     gettext_noop ("Invalid collation character"), /* REG_ECOLLATE */
>     gettext_noop ("Invalid character class name"), /* REG_ECTYPE */
>     gettext_noop ("Trailing backslash"), /* REG_EESCAPE */
>     gettext_noop ("Invalid back reference"), /* REG_ESUBREG */
>     gettext_noop ("Unmatched [ or [^"), /* REG_EBRACK */
>     gettext_noop ("Unmatched ( or \\("), /* REG_EPAREN */
>     gettext_noop ("Unmatched \\{"), /* REG_EBRACE */
>     gettext_noop ("Invalid content of \\{\\}"), /* REG_BADBR */
>     gettext_noop ("Invalid range end"), /* REG_ERANGE */
>     gettext_noop ("Memory exhausted"), /* REG_ESPACE */
>     gettext_noop ("Invalid preceding regular expression"), /* REG_BADRPT */
>     gettext_noop ("Premature end of regular expression"), /* REG_EEND */
>     gettext_noop ("Regular expression too big"), /* REG_ESIZE */
>     gettext_noop ("Unmatched ) or \\)"), /* REG_ERPAREN */
>     gettext_noop ("Range striding over charsets"), /* REG_ERANGEX */
>
> When doing isearch-*-regexp, most of those errors are converted into
> "incomplete" (i.e., *less* precise).  But I think it would be more
> helpful to show the original error message.

Agreed.  Unless Emacs can be sure that in some context one
of them really means that the regexp is not complete.

If the user can just keep typing to make a valid regexp
then an error msg (premature, not yet warranted) typically
hurts more than it helps, I think.  But if Emacs can't tell
that, then sure, why not?

Timing can mean a lot also (but depends on the user and how
much thinking is going on).  It's not great to interrupt
immediately with an error msg if the user was still typing.

And clearly some of those error conditions do _not_ end up
translated as "incomplete input" messages - or they should
not, in any case.

Clearly someone made a decision about "Trailing backslash",
for instance, and it's a very good decision IMO.  That's a
more useful "the-pattern-is-incomplete" message than just
saying "incomplete input".

We are not the first to consider the list of error
conditions and what to do about this one or that one.
That doesn't imply that the judgment made previously is
the best one.  It does suggest perhaps consulting those
who might have made it, or the larger emacs-devel community.
The behavior could be completely one-sided one way or the
other way.  Or it could be any kind of mix in between.  It
is currently a mix, but obviously not a perfect one - hence
this bug report.  Which tradeoff is best?

> > But I don't think it follows that it would be more helpful to
> > most users to show the invalid-regexp description in cases
> > where Emacs can really tell that the input is necessarily
> > incomplete.  I suspect that it is quite common for that
> > "incomplete input" message to be helpful.
>
> How does it help (compared to the more precise message)?

See above.  Isearch is incremental: you don't just enter a
complete regexp once and for all (as in `grep', for instance).
If Emacs jumps the gun with a premature "error" msg, that can
be annoying, no?

> Seems to me
> that telling the user they haven't finished entering the regexp isn't
> especially helpful; surely the user already knows they haven't finished
> typing yet.

No, _not_ surely - I think.  Many (most? maybe, maybe not)
users are not all that positive about using Emacs regexps.
They should be able to interact with Isearch regexps
interactively, incrementally.  And yes, I think that it
can be helpful to let a user know that the current pattern
is incomplete as a regexp.

But hey, users are different.  I'd argue that we could add
an option, with the default setting the current behavior
(as I expect it is those less experienced that the
"incomplete" behavior could benefit the most, and those
more experienced who could benefit most from the specific
"invalid" messages).  The latter are probably also the ones
most likely to try more complicated regexps, which benefit
the most, I expect, from precise "invalid" messages.

Adding the behavior you propose as an option shouldn't hurt.
But even for that you might want to propose it at emacs-devel.
There might be people there more familiar with different use
cases or who know more about the history of why the current
behavior is as it is.  Just a suggestion.
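
For readers following the argument, the mapping being debated can be
sketched in C.  This is only an illustration under stated assumptions:
it takes the three prefixes from the isearch.el hunk quoted in this
thread ("Premature ", "Unmatched ", "Invalid ") and applies them to
messages from the regex error table listed above.  The function name
`isearch_message_for' is hypothetical and is not part of Emacs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Emacs code): mimic the prefix test from
   the isearch.el hunk quoted in this thread.  An error text that
   starts with one of the prefixes is reported as "incomplete input";
   anything else is passed through unchanged.  */
static const char *
isearch_message_for (const char *regex_error)
{
  static const char *const prefixes[] =
    { "Premature ", "Unmatched ", "Invalid " };

  assert (regex_error != NULL);
  for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++)
    if (strncmp (regex_error, prefixes[i], strlen (prefixes[i])) == 0)
      return "incomplete input";
  return regex_error;
}
```

With this sketch, "Trailing backslash" comes back unchanged, while
both "Unmatched \\{" and "Invalid content of \\{\\}" collapse to
"incomplete input".  The last case is the one this report is about:
the prefix test cannot tell a pattern that merely needs more typing
from one that appending input can never fix.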
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 05 08:27:59 2017 Received: (at 24914) by debbugs.gnu.org; 5 Dec 2017 13:27:59 +0000 Received: from localhost ([127.0.0.1]:46551 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMDGQ-0007h7-QV for submit@debbugs.gnu.org; Tue, 05 Dec 2017 08:27:59 -0500 Received: from mail-it0-f48.google.com ([209.85.214.48]:39089) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMDGO-0007gt-Ke for 24914@debbugs.gnu.org; Tue, 05 Dec 2017 08:27:56 -0500 Received: by mail-it0-f48.google.com with SMTP id 68so1436366ite.4 for <24914@debbugs.gnu.org>; Tue, 05 Dec 2017 05:27:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=zsldws2n5GjXiNnmlwOVk4sjn7i9Ovh4RbGNHcOowtc=; b=iJtI6JYjmyTJhuRctRD5q0S9BnGlXx70Hw9lpNtspdoebBRILwyFNSE2auH3sZK1cN fPvXcHOFJEn1HQfgZ4QBDd/Bb8wJjrdGYZFmN/HNrojCCcvQDUPh9UBeUk1VKtHP38g4 J8MUswq5C6TPtXrVnb+aN/sT0BaNitUGOLVJJfZC+IJ56F4p3DBt4dhZCP+Lq5DJHEkY JXHfU+Da1LF50c0EsY6r3Fw9Jo08W6367ghTgrM7PKH8H+U5e46eyq6Iemn4i3gGrKPf AdrBvD8p3PQlt/f1IHfmtheSYzna8pxnuQ0yI8VX04j6mDX1Rqbp5nnb3MNtzhXH1Soh rtDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=zsldws2n5GjXiNnmlwOVk4sjn7i9Ovh4RbGNHcOowtc=; b=IpcUcbB3PMg+DZmaKYp0W4m1NHth0PScYwU/2M1vOWrc5+Th7wLBh5lxuujK6b6DFD q1ff3NK6ERZo2bqQXnKxPaYR36Ht1hvBp3ehNjH33laXbxn0VS97TRnAIlp7csWce/35 C0xxKkrF2ZrqozBe0/+nGlBbiLBTI9QjeCKzbkr6bqrPDtRfM04EvOrsSwqKAi6PGlC3 JjJcv+Tw/Q7kAN51whTfcaiFJ2gwjAWurTfAdnd50zHxmcm3HuLZ87Ti0pkxx0d4Vmt8 2tQX0RbvsWaqPauyTmJfnoOBKctT5rwoiIxymhGi8vXhv7Nq0I4QN/7bXzXAbn2FZGtn gDhg== X-Gm-Message-State: AKGB3mKFlvvYqPABCv7jW2ReJ5I/C28QqvjRcySSwXouqQ0k3tZdtptA Igwo359VFV+Xa8E/4ivI490K1A== X-Google-Smtp-Source: 
AGs4zMbDAig3aHYdQak7TT7orjGgwUMDCmB4C7P0yoAbrCvn5E6MXKvmNa9ezXLgllkey5Oryklu6Q== X-Received: by 10.107.46.92 with SMTP id i89mr14054932ioo.8.1512480470610; Tue, 05 Dec 2017 05:27:50 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id v19sm226787ite.4.2017.12.05.05.27.48 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 05 Dec 2017 05:27:49 -0800 (PST) From: Noam Postavsky To: Drew Adams Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <878tehhlwo.fsf@users.sourceforge.net> <3a58fdaf-10c0-42e6-8c74-753ce24b969e@default> Date: Tue, 05 Dec 2017 08:27:48 -0500 In-Reply-To: <3a58fdaf-10c0-42e6-8c74-753ce24b969e@default> (Drew Adams's message of "Mon, 4 Dec 2017 20:52:50 -0800 (PST)") Message-ID: <87609lgv8r.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 24914 Cc: 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) Drew Adams writes: > Clearly someone made a decision about "Trailing backslash", > for instance, and it's a very good decision IMO. That's a > more useful "the-pattern-is-incomplete" message than just > saying "incomplete input". Right, but I can't see why the same shouldn't apply to "Unmatched \\{" and all the others. > We are not the first to consider the list of error > conditions and what to do about this one or that one. > That doesn't imply that the judgment made previously is > the best one. 
It does suggest perhaps consulting those
> who might have made it, or the larger emacs-devel community.

That code seems to have been there since 1992 "Initial revision", so it's not clear what, if any, considerations were made.

> See above. Isearch is incremental: you don't just enter
> a complete regexp once and for all (as in `grep', for
> instance). If Emacs jumps the gun with a premature "error"
> msg, that can be annoying, no?

We already get an "error" message, it says "incomplete". The question is why shouldn't it instead say *why* it's incomplete.

> Adding the behavior you propose as an option shouldn't
> hurt.

It hurts, because it adds yet another option, which makes a user's job of going through them and deciding which ones make sense that much harder (yes, just this particular added option won't make much difference, but still).

> There might be people there more familiar with different use cases or
> who know more about the history of why the current behavior is as it
> is.

I hope so.
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 05 10:31:33 2017 Received: (at 24914) by debbugs.gnu.org; 5 Dec 2017 15:31:33 +0000 Received: from localhost ([127.0.0.1]:48044 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMFC1-0004NE-2e for submit@debbugs.gnu.org; Tue, 05 Dec 2017 10:31:33 -0500 Received: from aserp1040.oracle.com ([141.146.126.69]:24813) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMFBz-0004N1-42 for 24914@debbugs.gnu.org; Tue, 05 Dec 2017 10:31:31 -0500 Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id vB5FVOJp002460 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 5 Dec 2017 15:31:25 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id vB5FVN6u030458 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 5 Dec 2017 15:31:24 GMT Received: from abhmp0016.oracle.com (abhmp0016.oracle.com [141.146.116.22]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id vB5FVNuM017712; Tue, 5 Dec 2017 15:31:23 GMT MIME-Version: 1.0 Message-ID: Date: Tue, 5 Dec 2017 07:31:21 -0800 (PST) From: Drew Adams To: Noam Postavsky Subject: RE: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <878tehhlwo.fsf@users.sourceforge.net> <3a58fdaf-10c0-42e6-8c74-753ce24b969e@default> <87609lgv8r.fsf@users.sourceforge.net> In-Reply-To: <87609lgv8r.fsf@users.sourceforge.net> X-Priority: 3 X-Mailer: Oracle Beehive Extensions for Outlook 2.0.1.9.1 (1003210) [OL 16.0.4615.0 (x86)] Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable 
X-Source-IP: userv0021.oracle.com [156.151.31.71] X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 24914 Cc: 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -2.3 (--)

> > Clearly someone made a decision about "Trailing backslash",
> > for instance, and it's a very good decision IMO. That's a
> > more useful "the-pattern-is-incomplete" message than just
> > saying "incomplete input".
>
> Right, but I can't see why the same shouldn't apply to
> "Unmatched \\{" and all the others.

Treating them all the same is one possibility. Not the best one, I think, but one possibility.

> > We are not the first to consider the list of error
> > conditions and what to do about this one or that one.
> > That doesn't imply that the judgment made previously is
> > the best one. It does suggest perhaps consulting those
> > who might have made it, or the larger emacs-devel community.
>
> That code seems to have been there since 1992 "Initial
> revision", so it's not clear what, if any, considerations
> were made.

It might not be clear, but that doesn't mean there weren't good reasons for that judgment. (And no, I'm not saying that a different judgment can't be made now.)

And certainly some of those around then, including deciders probably, are still around today. Perhaps RMS has an opinion or recollection about this?

> > See above. Isearch is incremental: you don't just enter
> > a complete regexp once and for all (as in `grep', for
> > instance). If Emacs jumps the gun with a premature "error"
> > msg, that can be annoying, no?
>
> We already get an "error" message, it says "incomplete".
> The question is why shouldn't it instead say *why* it's
> incomplete.
I thought your proposal was to, in all cases, eliminate saying it is incomplete in favor of the specific regexp-invalidity error text. Such error text does not, generally and directly, tell users that the input is incomplete. Users very familiar with regexps might understand that such a msg implies that input is incomplete, but not everyone will get that.

> > Adding the behavior you propose as an option shouldn't
> > hurt.
>
> It hurts, because it adds yet another option, which makes a user's job
> of going through them and deciding which ones make sense that much
> harder (yes, just this particular added option won't make much
> difference, but still).

Users who are very familiar with Emacs regexps will be the ones to benefit most from the specific regexp-validity msgs, IMO. They should have no problem customizing an option. Users unfamiliar with Emacs or Emacs regexps will get the simpler default behavior: your input pattern is not yet complete.

If you feel strongly about this and are opposed to adding a user option, consider proposing the change you want to emacs-devel.

This particular bug is about one case: just the particular "incomplete input" message case cited. Fixing this bug shouldn't require changing the basic behavior, though that is certainly one possibility.

You apparently think there is never any value in telling users that the input pattern is not complete as a regexp. I disagree. We apparently agree that at least in some cases the specific regexp-invalidity message is more helpful.

There is a spectrum of possibilities here. I see no special reason why the right approach should be all or none.

> > There might be people there more familiar with different
> > use cases or who know more about the history of why the
> > current behavior is as it is.
>
> I hope so.
From debbugs-submit-bounces@debbugs.gnu.org Tue Dec 05 21:53:05 2017 Received: (at 24914) by debbugs.gnu.org; 6 Dec 2017 02:53:05 +0000 Received: from localhost ([127.0.0.1]:48405 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMPpY-0005EJ-HK for submit@debbugs.gnu.org; Tue, 05 Dec 2017 21:53:05 -0500 Received: from mail-it0-f51.google.com ([209.85.214.51]:44204) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eMPpW-0005Dk-KE; Tue, 05 Dec 2017 21:53:03 -0500 Received: by mail-it0-f51.google.com with SMTP id b5so5493717itc.3; Tue, 05 Dec 2017 18:53:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=DlwAQxIZGXEwKS02bWX0rnAyf5+nuRLM268qcPnzulU=; b=f0P7vBjYw6Ga1T1qTduNepGPPpXDiVcNxuuZKe3jvfDwxced3Mey/+g4pq2pqIDPtm FX8O390ApyUJJkSF7JnMe5ha+6HiGMif92VB8eqy7ThUudU4q5Drqv5AG9OfRzln6vpA /Tj7oUQ6WjWRK5AiWuMtfAtKWnwKlGXBe/X+K6qwRyUEFVzU+r3wJPC7i5YBS3a0JBf5 VGjb+WiHhacINoNuoBsl82KgDYjyvRkG4TsKELMYSCKmxE9CVsps2qgcGfM1Zbfr21y5 phXwzItnGffxR4PUCuMEJ/NWf3xBh2FJToeaKaw99JNx0CtMrVAhXK+T7ZA4VQooc4p6 h4jg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=DlwAQxIZGXEwKS02bWX0rnAyf5+nuRLM268qcPnzulU=; b=auZYHkYbkhatmYwrMr1+eTKJ540B0yFg8PZ72VWOzKcJ66UNmAHa2QPaDJ+XQ8UgI4 Z+Xuj5zz0yKPpsboCMeLivFAaPoOq42a6mVx8PEcCLz+XvrWjDlh5oUQCgyN559fd66E Yht8Z3OQXxE6cjrwa0roC4n5qyiB/+YLi/rnxbx5CqOh81vbcHwjP3pVvYTHDxqIB//m r1ggvZVAxYU9DvNoLSw1iJhuRSwBfhDYxa5kM6/GxDXMGirEXNWkJ48ctvUXTuNxsN9p walx/lntJAVgu2yZZehe6q5nkIfrUyt20pK2GM/Zf2F517Uua0VRJu5hLNIk31pmw+gT 7vBw== X-Gm-Message-State: AKGB3mLOTCzzOSzCwfFp43I4IOxtiLlP/FU03uezZBze0gX/qFSMcwcj ne8QIk+4foi0NlW//HU9FPMQ8Q== X-Google-Smtp-Source: 
AGs4zMbe2TVlwMDdfrRYFxFOd3LOAwCIEgoKAhRT3Cj+AU8Gss6C/0u3fLKUQewFSrBItJnEwrPDyw== X-Received: by 10.36.58.12 with SMTP id m12mr11225562itm.17.1512528776468; Tue, 05 Dec 2017 18:52:56 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id 143sm716505ioo.31.2017.12.05.18.52.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 05 Dec 2017 18:52:54 -0800 (PST) From: Noam Postavsky To: Drew Adams Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <878tehhlwo.fsf@users.sourceforge.net> <3a58fdaf-10c0-42e6-8c74-753ce24b969e@default> <87609lgv8r.fsf@users.sourceforge.net> Date: Tue, 05 Dec 2017 21:52:52 -0500 In-Reply-To: (Drew Adams's message of "Tue, 5 Dec 2017 07:31:21 -0800 (PST)") Message-ID: <871sk8h8jf.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 24914 Cc: 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) --=-=-= Content-Type: text/plain tags 24914 + patch quit Drew Adams writes: > Such error text does not, generally and directly, tell > users that the input is incomplete. Users very familiar > with regexps might understand that such a msg implies > that input is incomplete, but not everyone will get that. Hmm, I hadn't considered that possibility, but I will allow that *could* be a symptom of my being overly familiar with regexp syntax. 
> You apparently think there is never any value in > telling users that the input pattern is not > complete as a regexp. I disagree. We apparently > agree that at least in some cases the specific > regexp-invalidity message is more helpful. Okay, I've looked at the error messages a bit more closely, and I believe all the "Invalid ..." ones should never be considered "incomplete". See commit message for details. --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0001-Raise-limit-of-regexp-repetition-Bug-24914.patch Content-Description: patch >From 1d32f4d28521a143c333ef4cc125419661e3a3a9 Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Sat, 2 Dec 2017 19:01:54 -0500 Subject: [PATCH] Raise limit of regexp repetition (Bug#24914) * src/regex.h (RE_DUP_MAX): Raise limit to 2^32-1. * etc/NEWS: Announce it. * doc/lispref/searching.texi (Regexp Backslash): Document it. * src/regex.h (reg_errcode_t): Add REG_ESIZEBR code. * src/regex.c (re_error_msgid): Add corresponding entry. (GET_INTERVAL_COUNT): Return it instead of the more generic REG_EBADBR when encountering a repetition greater than RE_DUP_MAX. * lisp/isearch.el (isearch-search): Don't convert errors starting with "Invalid" into "incomplete". Such errors are not incomplete, in the sense that they cannot be corrected by appending more characters to the end of the regexp. The affected error messages are: - REG_BADPAT "Invalid regular expression" - \\(?X:\\) where X is not a legal group number - \\_X where X is not < or > - REG_ECOLLATE "Invalid collation character" - There is no code to throw this. 
- REG_ECTYPE "Invalid character class name" - [[:foo:] where foo is not a valid class name - REG_ESUBREG "Invalid back reference" - \N where N is referenced before matching group N - REG_BADBR "Invalid content of \\{\\}" - \\{N,M\\} where N < 0, M < N, M or N larger than max - \\{NX where X is not a digit or backslash - \\{N\\X where X is not a } - REG_ERANGE "Invalid range end" - There is no code to throw this. - REG_BADRPT "Invalid preceding regular expression" - We never throw this. It would usually indicate a "*" with no preceding regexp text, but Emacs allows that to match a literal "*". --- doc/lispref/searching.texi | 9 ++++++++- etc/NEWS | 8 ++++++++ lisp/isearch.el | 2 +- src/regex.c | 5 +++-- src/regex.h | 9 ++++++--- 5 files changed, 26 insertions(+), 7 deletions(-) diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi index 755fa554bb..724d66b5e3 100644 --- a/doc/lispref/searching.texi +++ b/doc/lispref/searching.texi @@ -639,7 +639,14 @@ Regexp Backslash is a more general postfix operator that specifies repetition with a minimum of @var{m} repeats and a maximum of @var{n} repeats. If @var{m} is omitted, the minimum is 0; if @var{n} is omitted, there is no -maximum. +maximum. For both forms, @var{m} and @var{n}, if specified, may be no +larger than +@ifnottex +2**31 @minus{} 1 +@end ifnottex +@tex +@math{2^{31}-1} +@end tex For example, @samp{c[ad]\@{1,2\@}r} matches the strings @samp{car}, @samp{cdr}, @samp{caar}, @samp{cadr}, @samp{cdar}, and @samp{cddr}, and diff --git a/etc/NEWS b/etc/NEWS index 4ccf468693..579cad058e 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -509,6 +509,14 @@ instead. ** The new user option 'arabic-shaper-ZWNJ-handling' controls how to handle ZWNJ in Arabic text rendering. ++++ +** The limit on repetitions in regexps has been raised to 2^31-1. +It was previously undocumented and limited to 2^15-1. 
For example, +the following regular expression was previously invalid, but is now +accepted: + + x\{32768\} + * Editing Changes in Emacs 26.1 diff --git a/lisp/isearch.el b/lisp/isearch.el index 13fa97ea71..093185a096 100644 --- a/lisp/isearch.el +++ b/lisp/isearch.el @@ -2851,7 +2851,7 @@ isearch-search (setq isearch-error (car (cdr lossage))) (cond ((string-match - "\\`Premature \\|\\`Unmatched \\|\\`Invalid " + "\\`Premature \\|\\`Unmatched " isearch-error) (setq isearch-error "incomplete input")) ((and (not isearch-regexp) diff --git a/src/regex.c b/src/regex.c index 330f2f78a8..ab74f457d4 100644 --- a/src/regex.c +++ b/src/regex.c @@ -1200,7 +1200,8 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax) gettext_noop ("Premature end of regular expression"), /* REG_EEND */ gettext_noop ("Regular expression too big"), /* REG_ESIZE */ gettext_noop ("Unmatched ) or \\)"), /* REG_ERPAREN */ - gettext_noop ("Range striding over charsets") /* REG_ERANGEX */ + gettext_noop ("Range striding over charsets"), /* REG_ERANGEX */ + gettext_noop ("Invalid content of \\{\\}, repetitions too big") /* REG_ESIZEBR */ }; /* Whether to allocate memory during matching. */ @@ -1921,7 +1922,7 @@ while (REMAINING_AVAIL_SLOTS <= space) { \ if (num < 0) \ num = 0; \ if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num) \ - FREE_STACK_RETURN (REG_BADBR); \ + FREE_STACK_RETURN (REG_ESIZEBR); \ num = num * 10 + c - '0'; \ if (p == pend) \ FREE_STACK_RETURN (REG_EBRACE); \ diff --git a/src/regex.h b/src/regex.h index 9fa8356011..b829848586 100644 --- a/src/regex.h +++ b/src/regex.h @@ -270,8 +270,10 @@ #ifdef RE_DUP_MAX # undef RE_DUP_MAX #endif -/* If sizeof(int) == 2, then ((1 << 15) - 1) overflows. */ -#define RE_DUP_MAX (0x7fff) +/* If sizeof(int) == 4, then ((1 << 31) - 1) overflows. This used to + be limited to 0x7fff, but Emacs never supported 16 bit platforms + anyway. */ +#define RE_DUP_MAX (0x7fffffff) /* POSIX `cflags' bits (i.e., information for `regcomp'). 
*/ @@ -337,7 +339,8 @@ REG_EEND, /* Premature end. */ REG_ESIZE, /* Compiled pattern bigger than 2^16 bytes. */ REG_ERPAREN, /* Unmatched ) or \); not returned from regcomp. */ - REG_ERANGEX /* Range striding over charsets. */ + REG_ERANGEX, /* Range striding over charsets. */ + REG_ESIZEBR /* n or m too big in \{n,m\} */ } reg_errcode_t; /* This data structure represents a compiled pattern. Before calling -- 2.11.0 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 08 04:48:58 2017 Received: (at 24914) by debbugs.gnu.org; 8 Dec 2017 09:48:58 +0000 Received: from localhost ([127.0.0.1]:51366 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNFH7-000070-Sx for submit@debbugs.gnu.org; Fri, 08 Dec 2017 04:48:58 -0500 Received: from eggs.gnu.org ([208.118.235.92]:38031) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNFH6-00006p-Pi for 24914@debbugs.gnu.org; Fri, 08 Dec 2017 04:48:57 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eNFGy-0006ss-LF for 24914@debbugs.gnu.org; Fri, 08 Dec 2017 04:48:51 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD, URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:53112) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eNFGy-0006sm-Hd; Fri, 08 Dec 2017 04:48:48 -0500 Received: from [176.228.60.248] (port=2720 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1eNFGw-0005Yp-OJ; Fri, 08 Dec 2017 04:48:47 -0500 Date: Fri, 08 Dec 2017 11:48:26 +0200 Message-Id: <83wp1xwnx1.fsf@gnu.org> From: Eli Zaretskii To: Noam Postavsky In-reply-to: <87h8t6gegl.fsf@users.sourceforge.net> (message from Noam Postavsky on Mon, 04 Dec 2017 20:18:02 -0500) Subject: Re: bug#24914: 24.5; 
isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Noam Postavsky > Date: Mon, 04 Dec 2017 20:18:02 -0500 > Cc: 24914@debbugs.gnu.org > > > Emacs never did (the MS-DOS port of Emacs runs in i386 32-bit > > protected mode on top of a 16-bit OS). But regex.c did, at some very > > distant past, to support the 16-bit MS compiler, or at least it tried > > to. > > So changing to 2^31 as the max should be fine, right? Or maybe just INT_MAX? 
From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 08 08:32:53 2017 Received: (at 24914) by debbugs.gnu.org; 8 Dec 2017 13:32:53 +0000 Received: from localhost ([127.0.0.1]:51480 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNIlp-0007Ce-1V for submit@debbugs.gnu.org; Fri, 08 Dec 2017 08:32:53 -0500 Received: from mail-it0-f46.google.com ([209.85.214.46]:41273) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNIln-0007CR-OE for 24914@debbugs.gnu.org; Fri, 08 Dec 2017 08:32:51 -0500 Received: by mail-it0-f46.google.com with SMTP id x28so4882367ita.0 for <24914@debbugs.gnu.org>; Fri, 08 Dec 2017 05:32:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=qpYUC7GxKvlz/mcc3JK1F2XpcpHMl2X2mOx4Uxc+d20=; b=tlLSYlOoPSEvehHAiEjJtN0rIVS50QCGmqyVYqP7BmfvIYk1jhFeRdHsy8vOJQqHSn /GHXURQ3UVdBPHAYhtvMqsuMiCSEvaaC+UVSKXsqkhbGztaGORZEjYTwXxuytF72vyha MB7xsUdDpUvR5FT4+KAJGhD33UbjNJTqzjB0puWGgTIAEWe4FkYIshAHL+YXgXICA4xF lLOKieWuS+VAD0rpCgRhfa7p4dfmZWOtR0MWUQHln2o5UP/Bu7S1nRp44DahQ74yjWO+ wD4wACSMh2E3GhR6NJiRcPuj+aqI4KeEFOjwOP/+WtrN408Wy8312kydm1yRsxh7vLcn fH8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=qpYUC7GxKvlz/mcc3JK1F2XpcpHMl2X2mOx4Uxc+d20=; b=KsQS1/GST2Q5i9kZX+MiyLTB2JaGTs4tpMfcHSuFzZidPff0dXSxUqNd7x91XCf9EF lXB0PUg1tB128waWnd/qLaNYuWjtSdRjyTeSl0PAVJmk3l2ssr36xaGYfJCXXXa1bCTv p/EB0fPufBy8gTozw4TuKkcLSOfOBCkd75UU+8zHo0bHIOqQ3RelHzYH86Z9zxxvD5Nh cGamFzOjPloHMzk2S82fiIrubaR/bfy45o9BQKQwpJVpHQEM60u0fvRGSWXQc/yEJ5Ih AfzJV18xYQcb5pj6raH2evGBqOiWR0WZ7/wPNPCMe3NmkpgTjfPapetKvb9VKNz8QzRj Vlsg== X-Gm-Message-State: AKGB3mLgr9pVn30qE8H/uO3rnEhA3qreX0qh1voKGR0zLd/cdaF9I6Jc LfHifTKTOQabGp9pPhfMBsXQqA== X-Google-Smtp-Source: 
ACJfBov5kSSlp157FAUUcvCjWp3pNHA5m1qH2xxBNp5cQfwbEgDp8PLQFdyExIcLiu0Za7UzreFVsA== X-Received: by 10.107.143.198 with SMTP id r189mr2740927iod.45.1512739965619; Fri, 08 Dec 2017 05:32:45 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id 131sm133409itf.25.2017.12.08.05.32.43 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 08 Dec 2017 05:32:44 -0800 (PST) From: Noam Postavsky To: Eli Zaretskii Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <83wp1xwnx1.fsf@gnu.org> Date: Fri, 08 Dec 2017 08:32:42 -0500 In-Reply-To: <83wp1xwnx1.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 08 Dec 2017 11:48:26 +0200") Message-ID: <874lp1fipx.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) Eli Zaretskii writes: >> From: Noam Postavsky >> Date: Mon, 04 Dec 2017 20:18:02 -0500 >> Cc: 24914@debbugs.gnu.org >> >> > Emacs never did (the MS-DOS port of Emacs runs in i386 32-bit >> > protected mode on top of a 16-bit OS). But regex.c did, at some very >> > distant past, to support the 16-bit MS compiler, or at least it tried >> > to. >> >> So changing to 2^31 as the max should be fine, right? > > Or maybe just INT_MAX? I thought it would be easier to document the limit if it's fixed across all machines. 
Otherwise we would have to say something like "For both forms, m and n, if specified, may be no larger than INT_MAX, which is usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used for building Emacs". From debbugs-submit-bounces@debbugs.gnu.org Fri Dec 08 09:35:40 2017 Received: (at 24914) by debbugs.gnu.org; 8 Dec 2017 14:35:40 +0000 Received: from localhost ([127.0.0.1]:51553 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNJkZ-0000E7-QO for submit@debbugs.gnu.org; Fri, 08 Dec 2017 09:35:40 -0500 Received: from eggs.gnu.org ([208.118.235.92]:54614) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNJkW-0000Dt-Ua for 24914@debbugs.gnu.org; Fri, 08 Dec 2017 09:35:38 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eNJkO-0001k1-N2 for 24914@debbugs.gnu.org; Fri, 08 Dec 2017 09:35:31 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD, URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:39610) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eNJkO-0001jn-K6; Fri, 08 Dec 2017 09:35:28 -0500 Received: from [176.228.60.248] (port=3178 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1eNJkN-0006ti-UQ; Fri, 08 Dec 2017 09:35:28 -0500 Date: Fri, 08 Dec 2017 16:35:07 +0200 Message-Id: <83bmj9wan8.fsf@gnu.org> From: Eli Zaretskii To: Noam Postavsky In-reply-to: <874lp1fipx.fsf@users.sourceforge.net> (message from Noam Postavsky on Fri, 08 Dec 2017 08:32:42 -0500) Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> 
<87h8t6gegl.fsf@users.sourceforge.net> <83wp1xwnx1.fsf@gnu.org> <874lp1fipx.fsf@users.sourceforge.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Noam Postavsky > Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org > Date: Fri, 08 Dec 2017 08:32:42 -0500 > > >> So changing to 2^31 as the max should be fine, right? > > > > Or maybe just INT_MAX? > > I thought it would be easier to document the limit if it's fixed across > all machines. Otherwise we would have to say something like "For both > forms, m and n, if specified, may be no larger than INT_MAX, which is > usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used > for building Emacs". Isn't int 32 bit wide everywhere? And anyway, since the bitmap is stored in an int, isn't INT_MAX TRT? But I'm not a language expert enough to argue; feel free to use your form if you think it's better. 
From debbugs-submit-bounces@debbugs.gnu.org Sat Dec 09 21:18:17 2017 Received: (at 24914) by debbugs.gnu.org; 10 Dec 2017 02:18:17 +0000 Received: from localhost ([127.0.0.1]:54343 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNrC5-0002Rq-4k for submit@debbugs.gnu.org; Sat, 09 Dec 2017 21:18:17 -0500 Received: from mail-io0-f171.google.com ([209.85.223.171]:38685) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNrC3-0002Rd-66 for 24914@debbugs.gnu.org; Sat, 09 Dec 2017 21:18:15 -0500 Received: by mail-io0-f171.google.com with SMTP id d14so6200701ioc.5 for <24914@debbugs.gnu.org>; Sat, 09 Dec 2017 18:18:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=l1xLH31GuPN15Jo73QID0UjITDdb+CQvHraNvnaqyBo=; b=q9QFleR5yYAxvKYp1LniK/nPZXrepEbgjCAYItWNxpwg04YLrpW+n6kSJeZCjYMyE5 sJdxUZ9nB3jYKy5MUz51kJpHQqw9bnPS7eTJcSGeskB7VpDWjn7qqgm74LAk+lC/+Ywl 3BAKoDdbF3Idi8OnhlolDCXwzZXulMSNVAT35eKsf/7e7SV9MFDcDOWHv3IoaHkRs08K XCfA7v6siCXSUHDngNZpOZjnO6cpLhMHb1fTQKVhdxriH2+LBNqPH6BIoJ5K2eWvFmFp DB0bainRYJ5RNGh4CdMl1n93+YycydyNWo2j4fmEIbTbpVo4xCPxsFDZCh5U3Kje7jX1 CUEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=l1xLH31GuPN15Jo73QID0UjITDdb+CQvHraNvnaqyBo=; b=NO7z3L8U1u1PaW92GjxgG2ogoaxW3hPqM28/exHqtOaCYUadnws7u0hHz/JZavQnVr 63DV6xYtHBcnXLlO6KHWG/EbA1gFw6qF2M1YJWxVM47nqkkcRnSvGF51cEJS9vLrWy9o r5BhHJBej4/GcCUTV4xpO0gFecyDboFprsgrHpgjIMTJ0rcwxPtC9AvzpyR+Y9FP4FT0 Ssm8pRMf8Ec/YQoQ0juMaHccJBEt29aDzwc+Cb3KnC2R0nPQBaG+tBX/WbXShgVSKvHH iN+4nJuvJrAghM504dO1e3csPoSVCFCX/ypkNAJc7B4BQ56pBVh/i2VL4EOj0zQik13E PEVg== X-Gm-Message-State: AKGB3mKwWjncbe+vVr2xZd4L0TJNFCNcGuIH0ERYnQDBIQaelcGoT+wl 42GA+msAEzpIe9H0e6C3b95Ldw== X-Google-Smtp-Source: 
ACJfBosSxMNNTvbXMaGG26tjm9eTH3m0CfNLQa8UmNiViOOy3BxZ/d2D+u3HsZba58h86R4KNZj3SQ== X-Received: by 10.107.163.14 with SMTP id m14mr10198322ioe.73.1512872289259; Sat, 09 Dec 2017 18:18:09 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id i133sm2549690itf.1.2017.12.09.18.18.06 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 09 Dec 2017 18:18:07 -0800 (PST) From: Noam Postavsky To: Eli Zaretskii Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <83wp1xwnx1.fsf@gnu.org> <874lp1fipx.fsf@users.sourceforge.net> <83bmj9wan8.fsf@gnu.org> Date: Sat, 09 Dec 2017 21:18:05 -0500 In-Reply-To: <83bmj9wan8.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 08 Dec 2017 16:35:07 +0200") Message-ID: <87vahfe36q.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> I thought it would be easier to document the limit if it's fixed across >> all machines. Otherwise we would have to say something like "For both >> forms, m and n, if specified, may be no larger than INT_MAX, which is >> usually 2**31 - 1, but could be 2**63 - 1 depending on the compiler used >> for building Emacs". > > Isn't int 32 bit wide everywhere? 
I might have been mixing up int with long when I was thinking about this; it seems only a few very obscure platforms have 64 bit ints. According to [1], everywhere but "HAL Computer Systems port of Solaris to the SPARC64" and "Classic UNICOS" has 32 bit ints. [1]: https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models > And anyway, since the bitmap is stored in an int, isn't INT_MAX TRT? Unfortunately, all this discussion of int size seems to be academic. I took another look at the code, there is another limit due to regexp opcode format. We can raise the limit to 2^16-1 though. Here is the use of RE_DUP_MAX, which makes it seem like int-size is the main limit: /* Get the next unsigned number in the uncompiled pattern. */ #define GET_INTERVAL_COUNT(num) \ ... if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num) \ FREE_STACK_RETURN (REG_ESIZEBR); \ static reg_errcode_t regex_compile (const_re_char *pattern, size_t size, { ... int lower_bound = 0, upper_bound = -1; [...] GET_INTERVAL_COUNT (lower_bound); But then INSERT_JUMP2 (succeed_n, laststart, b + 5 + nbytes, lower_bound); /* Like `STORE_JUMP2', but for inserting. Assume `b' is the buffer end. */ #define INSERT_JUMP2(op, loc, to, arg) \ insert_op2 (op, loc, (to) - (loc) - 3, arg, b) /* Like `insert_op1', but for two two-byte parameters ARG1 and ARG2. */ ^^^^^^^^ static void insert_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2, unsigned char *end) { ... store_op2 (op, loc, arg1, arg2); } /* Like `store_op1', but for two two-byte parameters ARG1 and ARG2. */ ^^^^^^^^ static void store_op2 (re_opcode_t op, unsigned char *loc, int arg1, int arg2) { *loc = (unsigned char) op; STORE_NUMBER (loc + 1, arg1); STORE_NUMBER (loc + 3, arg2); } /* Store NUMBER in two contiguous bytes starting at DESTINATION. 
*/ ^^^^^^^^^^^^^^^^^^^^ #define STORE_NUMBER(destination, number) \ do { \ (destination)[0] = (number) & 0377; \ (destination)[1] = (number) >> 8; \ } while (0) Here is the updated patch: --=-=-= Content-Type: text/plain Content-Disposition: attachment; filename=0001-Raise-limit-of-regexp-repetition-Bug-24914.patch Content-Description: patch >From 6c3ead6bd5c61801915dcedbb8dd17622610a899 Mon Sep 17 00:00:00 2001 From: Noam Postavsky Date: Sat, 2 Dec 2017 19:01:54 -0500 Subject: [PATCH] Raise limit of regexp repetition (Bug#24914) * src/regex.h (RE_DUP_MAX): Raise limit to 2^16-1. * etc/NEWS: Announce it. * doc/lispref/searching.texi (Regexp Backslash): Document it. * test/src/regex-tests.el (regex-repeat-limit): Test it. * src/regex.h (reg_errcode_t): Add REG_ESIZEBR code. * src/regex.c (re_error_msgid): Add corresponding entry. (GET_INTERVAL_COUNT): Return it instead of the more generic REG_EBADBR when encountering a repetition greater than RE_DUP_MAX. * lisp/isearch.el (isearch-search): Don't convert errors starting with "Invalid" into "incomplete". Such errors are not incomplete, in the sense that they cannot be corrected by appending more characters to the end of the regexp. The affected error messages are: - REG_BADPAT "Invalid regular expression" - \\(?X:\\) where X is not a legal group number - \\_X where X is not < or > - REG_ECOLLATE "Invalid collation character" - There is no code to throw this. - REG_ECTYPE "Invalid character class name" - [[:foo:] where foo is not a valid class name - REG_ESUBREG "Invalid back reference" - \N where N is referenced before matching group N - REG_BADBR "Invalid content of \\{\\}" - \\{N,M\\} where N < 0, M < N, M or N larger than max - \\{NX where X is not a digit or backslash - \\{N\\X where X is not a } - REG_ERANGE "Invalid range end" - There is no code to throw this. - REG_BADRPT "Invalid preceding regular expression" - We never throw this. 
It would usually indicate a "*" with no preceding regexp text, but Emacs allows that to match a literal "*". --- doc/lispref/searching.texi | 10 +++++++++- etc/NEWS | 8 ++++++++ lisp/isearch.el | 2 +- src/regex.c | 5 +++-- src/regex.h | 9 ++++++--- test/src/regex-tests.el | 6 ++++++ 6 files changed, 33 insertions(+), 7 deletions(-) diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi index 755fa554bb..ab52cf2802 100644 --- a/doc/lispref/searching.texi +++ b/doc/lispref/searching.texi @@ -639,7 +639,15 @@ Regexp Backslash is a more general postfix operator that specifies repetition with a minimum of @var{m} repeats and a maximum of @var{n} repeats. If @var{m} is omitted, the minimum is 0; if @var{n} is omitted, there is no -maximum. +maximum. For both forms, @var{m} and @var{n}, if specified, may be no +larger than +@ifnottex +2**16 @minus{} 1 +@end ifnottex +@tex +@math{2^{16}-1} +@end tex +. For example, @samp{c[ad]\@{1,2\@}r} matches the strings @samp{car}, @samp{cdr}, @samp{caar}, @samp{cadr}, @samp{cdar}, and @samp{cddr}, and diff --git a/etc/NEWS b/etc/NEWS index 64b53d88c8..c7efc53f6a 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -509,6 +509,14 @@ instead. ** The new user option 'arabic-shaper-ZWNJ-handling' controls how to handle ZWNJ in Arabic text rendering. ++++ +** The limit on repetitions in regexps has been raised to 2^16-1. +It was previously undocumented and limited to 2^15-1. 
For example, +the following regular expression was previously invalid, but is now +accepted: + + x\{32768\} + * Editing Changes in Emacs 26.1 diff --git a/lisp/isearch.el b/lisp/isearch.el index 13fa97ea71..093185a096 100644 --- a/lisp/isearch.el +++ b/lisp/isearch.el @@ -2851,7 +2851,7 @@ isearch-search (setq isearch-error (car (cdr lossage))) (cond ((string-match - "\\`Premature \\|\\`Unmatched \\|\\`Invalid " + "\\`Premature \\|\\`Unmatched " isearch-error) (setq isearch-error "incomplete input")) ((and (not isearch-regexp) diff --git a/src/regex.c b/src/regex.c index 330f2f78a8..ab74f457d4 100644 --- a/src/regex.c +++ b/src/regex.c @@ -1200,7 +1200,8 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax) gettext_noop ("Premature end of regular expression"), /* REG_EEND */ gettext_noop ("Regular expression too big"), /* REG_ESIZE */ gettext_noop ("Unmatched ) or \\)"), /* REG_ERPAREN */ - gettext_noop ("Range striding over charsets") /* REG_ERANGEX */ + gettext_noop ("Range striding over charsets"), /* REG_ERANGEX */ + gettext_noop ("Invalid content of \\{\\}, repetitions too big") /* REG_ESIZEBR */ }; /* Whether to allocate memory during matching. */ @@ -1921,7 +1922,7 @@ while (REMAINING_AVAIL_SLOTS <= space) { \ if (num < 0) \ num = 0; \ if (RE_DUP_MAX / 10 - (RE_DUP_MAX % 10 < c - '0') < num) \ - FREE_STACK_RETURN (REG_BADBR); \ + FREE_STACK_RETURN (REG_ESIZEBR); \ num = num * 10 + c - '0'; \ if (p == pend) \ FREE_STACK_RETURN (REG_EBRACE); \ diff --git a/src/regex.h b/src/regex.h index 9fa8356011..4c8632d6aa 100644 --- a/src/regex.h +++ b/src/regex.h @@ -270,8 +270,10 @@ #ifdef RE_DUP_MAX # undef RE_DUP_MAX #endif -/* If sizeof(int) == 2, then ((1 << 15) - 1) overflows. */ -#define RE_DUP_MAX (0x7fff) +/* Repeat counts are stored in opcodes as 2 byte integers. This was + previously limited to 7fff because the parsing code uses signed + ints. But Emacs only runs on 32 bit platforms anyway. 
*/ +#define RE_DUP_MAX (0xffff) /* POSIX `cflags' bits (i.e., information for `regcomp'). */ @@ -337,7 +339,8 @@ REG_EEND, /* Premature end. */ REG_ESIZE, /* Compiled pattern bigger than 2^16 bytes. */ REG_ERPAREN, /* Unmatched ) or \); not returned from regcomp. */ - REG_ERANGEX /* Range striding over charsets. */ + REG_ERANGEX, /* Range striding over charsets. */ + REG_ESIZEBR /* n or m too big in \{n,m\} */ } reg_errcode_t; /* This data structure represents a compiled pattern. Before calling diff --git a/test/src/regex-tests.el b/test/src/regex-tests.el index b1f1ea71ce..872d16a085 100644 --- a/test/src/regex-tests.el +++ b/test/src/regex-tests.el @@ -677,4 +677,10 @@ regex-tests-TESTS This evaluates the TESTS test cases from glibc." (should-not (regex-tests-TESTS))) +(ert-deftest regex-repeat-limit () + "Test the #xFFFF repeat limit." + (should (string-match "\\`x\\{65535\\}" (make-string 65535 ?x))) + (should-not (string-match "\\`x\\{65535\\}" (make-string 65534 ?x))) + (should-error (string-match "\\`x\\{65536\\}" "X") :type invalid-regexp)) + ;;; regex-tests.el ends here -- 2.11.0 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 10 01:50:13 2017 Received: (at 24914) by debbugs.gnu.org; 10 Dec 2017 06:50:13 +0000 Received: from localhost ([127.0.0.1]:54416 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNvRF-0000Io-FC for submit@debbugs.gnu.org; Sun, 10 Dec 2017 01:50:13 -0500 Received: from eggs.gnu.org ([208.118.235.92]:42521) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eNvRD-0000Ib-HA for 24914@debbugs.gnu.org; Sun, 10 Dec 2017 01:50:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eNvR3-000799-Sv for 24914@debbugs.gnu.org; Sun, 10 Dec 2017 01:50:06 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD 
autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:43697) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eNvR3-00078t-PI; Sun, 10 Dec 2017 01:50:01 -0500 Received: from [176.228.60.248] (port=1767 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1eNvR3-0004b1-6b; Sun, 10 Dec 2017 01:50:01 -0500 Date: Sun, 10 Dec 2017 08:49:44 +0200 Message-Id: <83r2s3t6uv.fsf@gnu.org> From: Eli Zaretskii To: Noam Postavsky In-reply-to: <87vahfe36q.fsf@users.sourceforge.net> (message from Noam Postavsky on Sat, 09 Dec 2017 21:18:05 -0500) Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <83wp1xwnx1.fsf@gnu.org> <874lp1fipx.fsf@users.sourceforge.net> <83bmj9wan8.fsf@gnu.org> <87vahfe36q.fsf@users.sourceforge.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Eli Zaretskii Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) > From: Noam Postavsky > Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org > Date: Sat, 09 Dec 2017 21:18:05 -0500 > > Unfortunately, all this discussion of int size seems to be academic. I > took another look at the code, there is another limit due to regexp > opcode format. We can raise the limit to 2^16-1 though. > [...] > Here is the updated patch: LGTM for master. Thanks for the research and for the patch. 
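To make the opcode-format limit discussed above concrete, here is a
minimal sketch in C. It is not code from regex.c: the sketch_* names
and SKETCH_DUP_MAX are hypothetical, and the unsigned read-back is an
assumption, since the thread does not show how regex.c extracts the
stored count. It only illustrates why a count stored in two bytes
cannot exceed 2^16-1, and how the digit-by-digit bound check quoted
from GET_INTERVAL_COUNT rejects larger values before they are stored.

```c
#include <assert.h>

/* Illustrative limit; same value the patch gives RE_DUP_MAX.  */
#define SKETCH_DUP_MAX 0xffff

/* Store NUMBER in two contiguous bytes, low byte first, in the same
   way as the STORE_NUMBER macro quoted in the thread.  */
static void
sketch_store_number (unsigned char *dest, int number)
{
  dest[0] = number & 0377;
  dest[1] = (unsigned char) (number >> 8);
}

/* Hypothetical unsigned read-back of the two bytes (an assumption;
   the real extraction code is not shown in the thread).  */
static int
sketch_extract_number (const unsigned char *src)
{
  return src[0] | (src[1] << 8);
}

/* Accumulate decimal digits using the same bound check as the quoted
   GET_INTERVAL_COUNT.  Returns -1 where the macro would bail out
   with an error, otherwise the parsed count.  */
static int
sketch_parse_count (const char *digits)
{
  int num = 0;
  for (; *digits >= '0' && *digits <= '9'; digits++)
    {
      int c = *digits;
      /* True when num * 10 + (c - '0') would exceed SKETCH_DUP_MAX.  */
      if (SKETCH_DUP_MAX / 10 - (SKETCH_DUP_MAX % 10 < c - '0') < num)
        return -1;
      num = num * 10 + c - '0';
    }
  return num;
}

/* Small self-check tying the two halves together.  */
static void
sketch_self_check (void)
{
  unsigned char buf[2];

  sketch_store_number (buf, 65535);
  assert (sketch_extract_number (buf) == 65535); /* fits in two bytes */

  sketch_store_number (buf, 65536);
  assert (sketch_extract_number (buf) == 0);     /* high bits are lost */

  assert (sketch_parse_count ("65535") == 65535);
  assert (sketch_parse_count ("65536") == -1);
}
```

Calling sketch_self_check () from a test program exercises both parts.
With the previous limit of 0x7fff in place of SKETCH_DUP_MAX, the same
check would reject the count 40000 from the original report.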
From debbugs-submit-bounces@debbugs.gnu.org Fri Jan 26 21:05:41 2018 Received: (at 24914) by debbugs.gnu.org; 27 Jan 2018 02:05:41 +0000 Received: from localhost ([127.0.0.1]:44181 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1efFsD-0003Jf-C1 for submit@debbugs.gnu.org; Fri, 26 Jan 2018 21:05:41 -0500 Received: from mail-it0-f45.google.com ([209.85.214.45]:33176) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1efFsB-0003JM-Mr; Fri, 26 Jan 2018 21:05:40 -0500 Received: by mail-it0-f45.google.com with SMTP id c102so25896519itd.0; Fri, 26 Jan 2018 18:05:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:references:date:in-reply-to:message-id :user-agent:mime-version; bh=Q8Z3HgTnuigBhQIJ9D1WueU5sbb/NtaPYRPlTI+5ahs=; b=PXJgtgn2GiBkSI87atawiTdmw6e7W3z4iDALxY2U8VOp+mkEihKU5mLYAGKXJP6ZKd dAIHH8PW6/F6CAXw4abUYLWgLBGsJ69LsbalnwzWqfBXBIjpp1MyyK+jk2arnIufWWIO /tkSczvDYv0NfImtI/W9kxl+ztFjOAPfs5J9DR3irBRfN+q0vmI6moQQkL/ZFOn+24l2 sY4kfZXQCcLE2lMi2p9ZiFG8n6RseU5EaDsgEW9X5D6TEYrCO9jPteB9lr4l/e0UTAZj E9bT8rUY3QoIzRfy6WZWKbTTOTMVZk9JwIgvZqxCWBJZ5GDGZXLxPR9YnWqSU03MYdnx Gc2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:references:date :in-reply-to:message-id:user-agent:mime-version; bh=Q8Z3HgTnuigBhQIJ9D1WueU5sbb/NtaPYRPlTI+5ahs=; b=dVtfPk1czlazbQC49PNbV4Ot5NQ7wPoPPNcWcKTA6MXC/ZD1Qtz8/9+TA7efZq9Yjl /ZxobYIIlC4ve3aTySUoEbNS+PBDJRkDh9iCNBfUYVoLVQRFIdCc546rnBNFJzmux+Zs MSD5trUGCpys6KL/2xQYLIWin9quCLV6Z1XzqH4BLopFJuZTOdflGf/kWu0pb7qiW0ap 2aNkRPibwSYsh++b3/l05DLqUPt+ruva9R8dOr87xyJyqadTNepKr1R1QU0Hgsr04pqY x7a99fkG6qdfQzL07xu9id/PC3rXtc0O+ohOVI/GJfcwVKFdX1t3brVjKGQTOeQ9Xt2n JuqQ== X-Gm-Message-State: AKwxytcE7pjigRTE+wIsy9c4eQZCRA3MCutfcBEjA7uzfjlFGqiqmOTw R3Fe3TVKinI6XtS4SbOTzLPTHg== X-Google-Smtp-Source: 
AH8x225nf8sJK3L7KIdxf2c7AJU/F75RsAKa+R+QzjXvV3GNAI5bV7wfIPaMUC0Jxi8ki3DCydGAEw== X-Received: by 10.36.40.141 with SMTP id h135mr18735350ith.77.1517018734180; Fri, 26 Jan 2018 18:05:34 -0800 (PST) Received: from zebian ([45.2.119.34]) by smtp.googlemail.com with ESMTPSA id j77sm1244788iod.47.2018.01.26.18.05.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 26 Jan 2018 18:05:33 -0800 (PST) From: Noam Postavsky To: Eli Zaretskii Subject: Re: bug#24914: 24.5; isearch-regexp: wrong error message References: <7c208ac0-8aa2-4db8-a38d-760f91c50500@default> <87h8t7ix7m.fsf@users.sourceforge.net> <87d13visrh.fsf@users.sourceforge.net> <87shcrgg8g.fsf@users.sourceforge.net> <87h8t6gegl.fsf@users.sourceforge.net> <83wp1xwnx1.fsf@gnu.org> <874lp1fipx.fsf@users.sourceforge.net> <83bmj9wan8.fsf@gnu.org> <87vahfe36q.fsf@users.sourceforge.net> <83r2s3t6uv.fsf@gnu.org> Date: Fri, 26 Jan 2018 21:05:31 -0500 In-Reply-To: <83r2s3t6uv.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 10 Dec 2017 08:49:44 +0200") Message-ID: <87372sm4yc.fsf@users.sourceforge.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.90 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.2 (/) X-Debbugs-Envelope-To: 24914 Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.2 (/) tags 24914 fixed close 24914 27.1 quit Eli Zaretskii writes: >> From: Noam Postavsky >> Cc: drew.adams@oracle.com, 24914@debbugs.gnu.org >> Date: Sat, 09 Dec 2017 21:18:05 -0500 >> >> Unfortunately, all this discussion of int size seems to be academic. I >> took another look at the code, there is another limit due to regexp >> opcode format. We can raise the limit to 2^16-1 though. >> [...] >> Here is the updated patch: > > LGTM for master. 
Thanks for the research and for the patch.

Pushed to master [1: 559f160616], also documented limit in emacs-26
[2: 463f96b481].

[1: 559f160616]: 2018-01-26 20:49:44 -0500
  Raise limit of regexp repetition (Bug#24914)
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=559f1606166822394df3988c18c0ad02984ac675

[2: 463f96b481]: 2018-01-26 19:53:09 -0500
  * doc/lispref/searching.texi: Document regexp repetition limit.
  https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=463f96b4813fb77d88a7b0fa93f94aa08d71689f

From unknown Thu Aug 14 22:19:44 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 24 Feb 2018 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator