From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 29 15:19:40 2014
From: Nicolas Richard
To: bug-gnu-emacs@gnu.org
Subject: 24.3.50; match data is incorrect if there are too many groups
Date: Tue, 29 Apr 2014 21:19:11 +0200
Message-ID: <87ppk0hrkg.fsf@yahoo.fr>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

Hi,

The following reports 2. Replace 255 by 254, and it'll report 512 as
expected:

#+BEGIN_SRC emacs-lisp
(with-temp-buffer
  (insert "bar")
  (when (re-search-backward
         (concat (mapconcat (lambda (x) (format "\\(%s\\)" x))
                            (make-list 255 "foo")
                            "\\|")
                 "\\|"
                 "\\(bar\\)")
         nil t)
    (length (match-data))))
#+END_SRC

Regexps with many groups are the kind of thing used in AUCTeX, in
TeX-auto-parse-region. What AUCTeX does in that function is construct a
big regexp out of a list of smaller ones (each small one is made into a
group); when the big regexp matches, it then tries to find out which of
the smaller regexps actually matched by checking which group is non-nil.

In GNU Emacs 24.3.50.7 (i686-pc-linux-gnu, GTK+ Version 2.24.20)
 of 2014-04-10 on LDLC-portable
Windowing system distributor `The X.Org Foundation', version 11.0.11405000
System Description: Ubuntu 13.10

Configured using:
 `configure 'CFLAGS=-g3 -O2''

Important settings:
  value of $LANG: fr_BE.UTF-8
  locale-coding-system: utf-8-unix

--
Nico.

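A minimal sketch of the technique described in the report above, for
illustration only; it is not AUCTeX's actual code. The helper name
my-which-alternative-matched is hypothetical, and the sketch assumes
that none of the smaller regexps contains a capturing group of its own,
so that alternative number I corresponds to group I+1.

#+BEGIN_SRC emacs-lisp
(defun my-which-alternative-matched (regexps string)
  "Return the index in REGEXPS of the alternative that matched STRING.
Return nil if nothing matched.  Hypothetical helper: assumes no
element of REGEXPS has capturing groups of its own."
  ;; Wrap each small regexp in a group and join them with alternation.
  (let ((big (mapconcat (lambda (re) (concat "\\(" re "\\)"))
                        regexps "\\|")))
    (when (string-match big string)
      (let ((i 0) found)
        ;; The first group with a non-nil start is the one that matched.
        (while (and (not found) (< i (length regexps)))
          (if (match-beginning (1+ i))
              (setq found i)
            (setq i (1+ i))))
        found))))

(my-which-alternative-matched '("foo" "bar" "baz") "xbar")
#+END_SRC

With only three alternatives the last form evaluates to 1, the index of
"bar". A lookup of this kind depends on every group being recorded in
the match data, which is why a combined regexp with too many groups
breaks it.
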
From debbugs-submit-bounces@debbugs.gnu.org Mon May 19 01:47:51 2014
From: Paul Eggert
Organization: UCLA Computer Science Department
To: 17373@debbugs.gnu.org
Subject: Re: 24.3.50; match data is incorrect if there are too many groups
Date: Sun, 18 May 2014 22:47:33 -0700
Message-ID: <53799AF5.9090708@cs.ucla.edu>

Yes, unfortunately Emacs currently has a limit of at most 256 groups of
match data: one for the entire pattern, and 255 for parenthesized
subpatterns. If you go over the limit, the excess matches are silently
discarded. I don't see this limitation documented anywhere; it should
be. Or better yet, the limitation should be removed.

The limitation is wired into the representation of the 'start_memory'
code in compiled regular expressions: this code has a one-byte operand.
As far as I know, the limitation is specific to Emacs, and is not
present in the Gnulib or glibc versions of the regexp matcher.

From debbugs-submit-bounces@debbugs.gnu.org Mon May 19 09:48:37 2014
From: Drew Adams
To: Paul Eggert, 17373@debbugs.gnu.org
Subject: RE: bug#17373: 24.3.50; match data is incorrect if there are too many groups
Date: Mon, 19 May 2014 06:48:16 -0700 (PDT)
Message-ID: <3dc9fa47-c3d8-40e2-b6e4-3f362a0c1b6e@default>
References: <87ppk0hrkg.fsf@yahoo.fr> <53799AF5.9090708@cs.ucla.edu>
In-Reply-To: <53799AF5.9090708@cs.ucla.edu>

> Yes, unfortunately Emacs currently has a limit of at most 256 groups of
> match data: one for the entire pattern, and 255 for parenthesized
> subpatterns. If you go over the limit, the excess matches are silently
> discarded. I don't see this limitation documented anywhere; it should
> be. Or better yet, the limitation should be removed.

Good to know. +1 to documenting it, at least.

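Until the limit is documented or removed, code that builds large
regexps could check the group count itself. The following is only a
sketch under stated assumptions: the names my-max-regexp-groups and
my-checked-regexp are hypothetical, and the value 255 is taken from the
description quoted above rather than read from Emacs. It relies on
regexp-opt-depth, which counts the capturing groups in a regexp.

#+BEGIN_SRC emacs-lisp
(require 'regexp-opt)

(defconst my-max-regexp-groups 255
  "Group limit as described in this thread (an assumption).")

(defun my-checked-regexp (regexp)
  "Return REGEXP, or signal an error if it has too many groups.
Hypothetical guard; the limit is `my-max-regexp-groups'."
  (let ((n (regexp-opt-depth regexp)))
    (if (> n my-max-regexp-groups)
        (error "Regexp has %d groups; only %d are recorded in the match data"
               n my-max-regexp-groups)
      regexp)))
#+END_SRC

Passing a combined regexp through my-checked-regexp before searching
would turn the silent truncation into an explicit error at the point
where the regexp is built.
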
From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 10 12:11:59 2016
From: Marcin Borkowski
To: 17373@debbugs.gnu.org
Cc: Paul Eggert, Drew Adams
Subject: Re: bug#17373: 24.3.50; match data is incorrect if there are too many groups
Date: Wed, 10 Feb 2016 18:11:54 +0100
Message-ID: <8737t0qylh.fsf@wmi.amu.edu.pl>
References: <87ppk0hrkg.fsf@yahoo.fr> <53799AF5.9090708@cs.ucla.edu> <3dc9fa47-c3d8-40e2-b6e4-3f362a0c1b6e@default>
In-Reply-To: <3dc9fa47-c3d8-40e2-b6e4-3f362a0c1b6e@default> (Drew Adams's message of "Mon, 19 May 2014 06:48:16 -0700 (PDT)")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

On 2014-05-19, at 07:48, Drew Adams wrote:

>> Yes, unfortunately Emacs currently has a limit of at most 256 groups of
>> match data: one for the entire pattern, and 255 for parenthesized
>> subpatterns. If you go over the limit, the excess matches are silently
>> discarded. I don't see this limitation documented anywhere; it should
>> be. Or better yet, the limitation should be removed.
>
> Good to know. +1, to documenting it, at least.

I can write a patch to the manual, but I'm a bit afraid that if this
gets documented, the limit will stay there forever. Is there any chance
that someone fluent in C could fix this?

(Incidentally, I have one package of mine where this limit could strike,
too.)

Best,

--
Marcin Borkowski

From debbugs-submit-bounces@debbugs.gnu.org Sat Jun 04 18:47:44 2016
From: Noam Postavsky
To: control@debbugs.gnu.org
Subject: 24.3.50; match data is incorrect if there are too many groups
Date: Sat, 4 Jun 2016 18:47:37 -0400

found 17373 25.0.94
tag + 17373 confirmed
severity 17373 minor
quit

From debbugs-submit-bounces@debbugs.gnu.org Sat Jun 04 18:50:42 2016
From: Noam Postavsky
To: GNU bug tracker automated control server
Subject: Re: Processed (with 1 errors): 24.3.50; match data is incorrect if there are too many groups
Date: Sat, 4 Jun 2016 18:50:32 -0400

# On Sat, Jun 4, 2016 at 6:48 PM, GNU bug tracker automated control server wrote:
# >> tag + 17373 confirmed
# > Unknown command or malformed arguments to command.
tag 17373 + confirmed
quit