GNU bug report logs - #17373
24.3.50; match data is incorrect if there are too many groups
To reply to this bug, email your comments to 17373 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#17373; Package emacs. (Tue, 29 Apr 2014 19:20:03 GMT)
Acknowledgement sent to Nicolas Richard <theonewiththeevillook <at> yahoo.fr>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 29 Apr 2014 19:20:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
The following reports 2. Replace 255 with 254 and it reports 512, as expected:
#+BEGIN_SRC emacs-lisp
(with-temp-buffer
  (insert "bar")
  (when
      (re-search-backward
       (concat
        (mapconcat (lambda (x) (format "\\(%s\\)" x)) (make-list 255 "foo") "\\|")
        "\\|"
        "\\(bar\\)")
       nil t)
    (length (match-data))))
#+END_SRC
Regexps with many groups are the kind of thing used in AUCTeX, in
TeX-auto-parse-region. What AUCTeX does in that function is construct a
big regexp out of a list of smaller ones (each small one is made into a
group); then, when the big regexp matches, it tries to find out
which of the smaller regexps actually matched by checking which group is
non-nil.
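A minimal sketch of that technique, for illustration only. This is not AUCTeX's actual code: the helper names `my-combine-regexps` and `my-matched-index` are hypothetical, and the sketch assumes that none of the small regexps contains capturing groups of its own.

```emacs-lisp
;; Hypothetical helpers for illustration; not taken from AUCTeX.
(defun my-combine-regexps (regexps)
  "Join REGEXPS into one alternation, wrapping each in its own group.
Assumes no element of REGEXPS contains capturing groups itself."
  (mapconcat (lambda (re) (concat "\\(" re "\\)")) regexps "\\|"))

(defun my-matched-index (count)
  "Return the zero-based index of the first non-nil group among COUNT groups.
Call this right after a successful search.  Return nil if no group is set."
  (let ((i 1) found)
    (while (and (not found) (<= i count))
      (when (match-beginning i)
        (setq found (1- i)))
      (setq i (1+ i)))
    found))

;; Usage: find out which of three small regexps matched "bar".
(let ((regexps '("foo" "bar" "baz")))
  (with-temp-buffer
    (insert "bar")
    (goto-char (point-min))
    (when (re-search-forward (my-combine-regexps regexps) nil t)
      (my-matched-index (length regexps)))))
;; => 1
```

With only three groups the lookup works; the problem reported here appears once the combined regexp has more groups than Emacs records in the match data.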
In GNU Emacs 24.3.50.7 (i686-pc-linux-gnu, GTK+ Version 2.24.20)
of 2014-04-10 on LDLC-portable
Windowing system distributor `The X.Org Foundation', version 11.0.11405000
System Description: Ubuntu 13.10
Configured using:
`configure 'CFLAGS=-g3 -O2''
Important settings:
value of $LANG: fr_BE.UTF-8
locale-coding-system: utf-8-unix
--
Nico.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17373; Package emacs. (Mon, 19 May 2014 05:48:02 GMT)
Message #8 received at 17373 <at> debbugs.gnu.org (full text, mbox):
Yes, unfortunately Emacs currently has a limit of at most 256 groups of
match data: one for the entire pattern, and 255 for parenthesized
subpatterns. If you go over the limit, the excess matches are silently
discarded. I don't see this limitation documented anywhere; it should
be. Or better yet, the limitation should be removed.
The limitation is wired into the representation of the 'start_memory'
code in compiled regular expressions: this code has a one-byte operand.
As far as I know, the limitation is specific to Emacs, and is not
present in the Gnulib or glibc versions of the regexp matcher.
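Until the limitation is removed, Lisp code that builds large regexps could guard against it by counting groups before searching. The following is only a sketch: `my-max-regexp-groups` and `my-regexp-groups-ok-p` are hypothetical names, and the value 255 is an assumption taken from the limit described in this message. `regexp-opt-depth` is the existing function that counts the capturing groups in a regexp.

```emacs-lisp
;; Hypothetical guard; the limit value is taken from this thread.
(defconst my-max-regexp-groups 255
  "Assumed maximum number of parenthesized groups recorded in match data.")

(defun my-regexp-groups-ok-p (regexp)
  "Return non-nil if REGEXP has at most `my-max-regexp-groups' groups.
Shy groups, written \\(?: ... \\), are not counted."
  (<= (regexp-opt-depth regexp) my-max-regexp-groups))

;; Usage: a two-group regexp is well under the assumed limit.
(my-regexp-groups-ok-p "\\(foo\\)\\|\\(bar\\)")
```

A caller that fails this check could split its list of small regexps into several combined regexps and search with each in turn.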
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17373; Package emacs. (Mon, 19 May 2014 13:49:02 GMT)
Message #11 received at 17373 <at> debbugs.gnu.org (full text, mbox):
> Yes, unfortunately Emacs currently has a limit of at most 256 groups of
> match data: one for the entire pattern, and 255 for parenthesized
> subpatterns. If you go over the limit, the excess matches are silently
> discarded. I don't see this limitation documented anywhere; it should
> be. Or better yet, the limitation should be removed.
Good to know. +1, to documenting it, at least.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17373; Package emacs. (Wed, 10 Feb 2016 17:12:02 GMT)
Message #14 received at 17373 <at> debbugs.gnu.org (full text, mbox):
On 2014-05-19, at 07:48, Drew Adams <drew.adams <at> oracle.com> wrote:
>> Yes, unfortunately Emacs currently has a limit of at most 256 groups of
>> match data: one for the entire pattern, and 255 for parenthesized
>> subpatterns. If you go over the limit, the excess matches are silently
>> discarded. I don't see this limitation documented anywhere; it should
>> be. Or better yet, the limitation should be removed.
>
> Good to know. +1, to documenting it, at least.
I can write a patch to the manual, but I'm a bit afraid that if this
gets documented, the limit will stay there forever. Is there any chance
that someone fluent in C could fix this?
(Incidentally, I have one package of mine where this limit could strike,
too.)
Best,
--
Marcin Borkowski
bug Marked as found in versions 25.0.94. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Sat, 04 Jun 2016 22:48:02 GMT)
Severity set to 'minor' from 'normal'. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Sat, 04 Jun 2016 22:48:02 GMT)
Added tag(s) confirmed. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Sat, 04 Jun 2016 22:51:02 GMT)
This bug report was last modified 9 years and 9 days ago.