GNU bug report logs - #24751
26.0.50; Regex stack overflow not detected properly (gets "Variable binding depth exceeds max-specpdl-size")

Previous Next

Package: emacs;

Reported by: npostavs <at> users.sourceforge.net

Date: Fri, 21 Oct 2016 03:54:01 UTC

Severity: normal

Tags: fixed, patch

Found in version 26.0.50

Fixed in version 26.1

Done: npostavs <at> users.sourceforge.net

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 24751 in the body.
You can then email your comments to 24751 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Fri, 21 Oct 2016 03:54:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to npostavs <at> users.sourceforge.net:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 21 Oct 2016 03:54:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: bug-gnu-emacs <at> gnu.org
Subject: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Thu, 20 Oct 2016 23:54:05 -0400
[Message part 1 (text/plain, inline)]
This is a followup from Bug #24358, since that thread filled up with
messages about the crash in 25.1, so I'm opening a new bug for the wrong
regex stack overflow check (this affects only the master branch).

To reproduce the error run (re-search-forward ".*\\(\n.*\\)*" nil t) in
a buffer with contents of the attached bug-24358-regex-max-specpdl.txt.

 
[bug-24358-regex-max-specpdl.txt (text/plain, attachment)]
[Message part 3 (text/plain, inline)]
Here is my analysis from
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=24358#27 again:

[Message part 4 (message/rfc822, inline)]
Subject: bug#24358: 25.1.50; re-search-forward errors with "Variable binding depth exceeds max-specpdl-size"
Date: Sat, 03 Sep 2016 11:43:16 -0400
> (I'm also on GNU/Linux, Arch) I get the same max-specpdl-size error
> with 25.1.50 [this refers to the master branch which is now 26.0.50
> (emacs-25 was 25.0.xx at the time)].  With emacs version 24.5 (and
> below) I get (error "Stack overflow in regexp matcher") [as expected].

The problem is that re_max_failures is set to 40001 (instead of the
original 40000) in main()[1], which is a problem because of the
GROW_FAIL_STACK uses (re_max_failures * TYPICAL_FAILURE_SIZE) as a cap
on the amount to allocate, but ((fail_stack).size * sizeof
(fail_stack_elt_t)) to calculate current allocation.

Since TYPICAL_FAILURE_SIZE = 20 and sizeof (fail_stack_elt_t) = 8, and
it seems that (fail_stack).size grows in increments of 3, when
(fail_stack).avail is 99999 and (fail_stack).size reaches 100002:

(fail_stack).size * sizeof (fail_stack_elt_t) => 800016
  re_max_failures * TYPICAL_FAILURE_SIZE      => 800020

ENSURE_FAIL_STACK(3) then loops indefinitely reallocating a stack of
size 800020 again and again until the record_xmalloc fails to
grow_specdl() (thus the "max-specpdl-size" error).

----------

So we we might want to fix the re_max_failures setting in main, but it
doesn't quite make sense to me that GROW_FAIL_STACK relies on
re_max_failures being a multiple of (sizeof (fail_stack_elt_t)).  At the
definition of TYPICAL_FAILURE_SIZE we have

/* Estimate the size of data pushed by a typical failure stack entry.
   An estimate is all we need, because all we use this for
   is to choose a limit for how big to make the failure stack.  */
/* BEWARE, the value `20' is hard-coded in emacs.c:main().  */
#define TYPICAL_FAILURE_SIZE 20

Why do we use an "estimate" here?  What's wrong with just using
(re_max_failures * sizeof (fail_stack_elt_t)) as the limit?  Or should
the limit actually be (re_max_failures * TYPICAL_FAILURE_SIZE * sizeof
(fail_stack_elt_t))?

-----------

827	      long lim = rlim.rlim_cur;
(gdb) p rlim
$1 = {
  rlim_cur = 8388608, 
  rlim_max = 18446744073709551615
}
(gdb) next
833	      int ratio = 20 * sizeof (char *);
(gdb) 
834	      ratio += ratio / 3;
(gdb) 
837	      int extra = 200000;
(gdb) p ratio
$2 = 213
[...]
(gdb) display ((newlim - extra) / ratio)
1: ((newlim - extra) / ratio) = 40000
(gdb) next
856		  newlim += pagesize - 1;
1: ((newlim - extra) / ratio) = 40000
(gdb) 
857		  if (0 <= rlim.rlim_max && rlim.rlim_max < newlim)
1: ((newlim - extra) / ratio) = 40019
(gdb) 
859		  newlim -= newlim % pagesize;
1: ((newlim - extra) / ratio) = 40019
(gdb) 
861		  if (pagesize <= newlim - lim)
1: ((newlim - extra) / ratio) = 40001
(gdb) undisplay 1
(gdb) next
863		      rlim.rlim_cur = newlim;
(gdb) 
864		      if (setrlimit (RLIMIT_STACK, &rlim) == 0)
(gdb) 
865			lim = newlim;
(gdb) 
870	      re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
(gdb) 
875	  stack_bottom = &stack_bottom_variable;
(gdb) p re_max_failures
$3 = 40001

-----------

[1]: This was the case since 9d356f62 2016-05-27 "Robustify stack-size calculation"


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Fri, 04 Nov 2016 08:22:02 GMT) Full text and rfc822 format available.

Message #8 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Fri, 04 Nov 2016 10:22:08 +0200
> From: npostavs <at> users.sourceforge.net
> Date: Thu, 20 Oct 2016 23:54:05 -0400
> 
> So we we might want to fix the re_max_failures setting in main, but it
> doesn't quite make sense to me that GROW_FAIL_STACK relies on
> re_max_failures being a multiple of (sizeof (fail_stack_elt_t)).  At the
> definition of TYPICAL_FAILURE_SIZE we have
> 
> /* Estimate the size of data pushed by a typical failure stack entry.
>    An estimate is all we need, because all we use this for
>    is to choose a limit for how big to make the failure stack.  */
> /* BEWARE, the value `20' is hard-coded in emacs.c:main().  */
> #define TYPICAL_FAILURE_SIZE 20
> 
> Why do we use an "estimate" here?  What's wrong with just using
> (re_max_failures * sizeof (fail_stack_elt_t)) as the limit?  Or should
> the limit actually be (re_max_failures * TYPICAL_FAILURE_SIZE * sizeof
> (fail_stack_elt_t))?

I think it should be the latter, indeed.

Can you propose a patch along those lines that would remove the
infloop in ENSURE_FAIL_STACK?

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sat, 05 Nov 2016 19:34:02 GMT) Full text and rfc822 format available.

Message #11 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sat, 05 Nov 2016 15:34:29 -0400
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: npostavs <at> users.sourceforge.net
>> Date: Thu, 20 Oct 2016 23:54:05 -0400
>> 
>> So we we might want to fix the re_max_failures setting in main, but it
>> doesn't quite make sense to me that GROW_FAIL_STACK relies on
>> re_max_failures being a multiple of (sizeof (fail_stack_elt_t)).  At the
>> definition of TYPICAL_FAILURE_SIZE we have
>> 
>> /* Estimate the size of data pushed by a typical failure stack entry.
>>    An estimate is all we need, because all we use this for
>>    is to choose a limit for how big to make the failure stack.  */
>> /* BEWARE, the value `20' is hard-coded in emacs.c:main().  */
>> #define TYPICAL_FAILURE_SIZE 20
>> 
>> Why do we use an "estimate" here?  What's wrong with just using
>> (re_max_failures * sizeof (fail_stack_elt_t)) as the limit?  Or should
>> the limit actually be (re_max_failures * TYPICAL_FAILURE_SIZE * sizeof
>> (fail_stack_elt_t))?
>
> I think it should be the latter, indeed.
>
> Can you propose a patch along those lines that would remove the
> infloop in ENSURE_FAIL_STACK?
>
> Thanks.

The below seems to work, but effectively increases the size of the
failure stack (so the sample file size has to be increased 8-fold to get
a regex stack overflow).  Strangely, changing the value in the
definition of re_max_failures doesn't seem to have any effect, it stays
40000 regardless.  I am quite confused.

diff --git i/src/regex.c w/src/regex.c
index 1c6c9e5..163c5b4 100644
--- i/src/regex.c
+++ w/src/regex.c
@@ -1320,19 +1320,22 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax)
 
 #define GROW_FAIL_STACK(fail_stack)					\
   (((fail_stack).size * sizeof (fail_stack_elt_t)			\
-    >= re_max_failures * TYPICAL_FAILURE_SIZE)				\
+    >= re_max_failures * sizeof (fail_stack_elt_t)                      \
+    * TYPICAL_FAILURE_SIZE)                                             \
    ? 0									\
    : ((fail_stack).stack						\
       = REGEX_REALLOCATE_STACK ((fail_stack).stack,			\
 	  (fail_stack).size * sizeof (fail_stack_elt_t),		\
-	  min (re_max_failures * TYPICAL_FAILURE_SIZE,			\
+	  min (re_max_failures * sizeof (fail_stack_elt_t)              \
+               * TYPICAL_FAILURE_SIZE,                                  \
 	       ((fail_stack).size * sizeof (fail_stack_elt_t)		\
 		* FAIL_STACK_GROWTH_FACTOR))),				\
 									\
       (fail_stack).stack == NULL					\
       ? 0								\
       : ((fail_stack).size						\
-	 = (min (re_max_failures * TYPICAL_FAILURE_SIZE,		\
+         = (min (re_max_failures * sizeof (fail_stack_elt_t)            \
+                 * TYPICAL_FAILURE_SIZE,                                \
 		 ((fail_stack).size * sizeof (fail_stack_elt_t)		\
 		  * FAIL_STACK_GROWTH_FACTOR))				\
 	    / sizeof (fail_stack_elt_t)),				\






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 06 Nov 2016 15:46:02 GMT) Full text and rfc822 format available.

Message #14 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 06 Nov 2016 17:45:40 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Sat, 05 Nov 2016 15:34:29 -0400
> 
> >> #define TYPICAL_FAILURE_SIZE 20
> >> 
> >> Why do we use an "estimate" here?  What's wrong with just using
> >> (re_max_failures * sizeof (fail_stack_elt_t)) as the limit?  Or should
> >> the limit actually be (re_max_failures * TYPICAL_FAILURE_SIZE * sizeof
> >> (fail_stack_elt_t))?
> >
> > I think it should be the latter, indeed.
> >
> > Can you propose a patch along those lines that would remove the
> > infloop in ENSURE_FAIL_STACK?
> >
> > Thanks.
> 
> The below seems to work

Thanks.

I think the patch can be simplified, where we now multiply by the size
of fail_stack_elt_t and then divide by it: simply remove both the
multiplication and the division.  That will make the code easier to
read, and will make the units of each variable clear, something that I
think is at the heart of this issue.

> but effectively increases the size of the failure stack (so the
> sample file size has to be increased 8-fold to get a regex stack
> overflow).

Which IMO is exactly TRT, since re_max_failures was computed given the
runtime stack size of 8MB, so having it bail out after merely 800KB
doesn't sound right to me, don't you agree?

> Strangely, changing the value in the definition of re_max_failures
> doesn't seem to have any effect, it stays 40000 regardless.  I am
> quite confused.

I don't think I follow.  Can you tell what you tried to change, and
where did you see the lack of any effect?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 13 Nov 2016 05:40:01 GMT) Full text and rfc822 format available.

Message #17 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 13 Nov 2016 00:39:39 -0500
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>
> I think the patch can be simplified, where we now multiply by the size
> of fail_stack_elt_t and then divide by it: simply remove both the
> multiplication and the division.  That will make the code easier to
> read, and will make the units of each variable clear, something that I
> think is at the heart of this issue.

Ah, right.

[v2-0001-Fix-computation-of-regex-stack-limit.patch (text/plain, attachment)]
[Message part 3 (text/plain, inline)]

>
>> but effectively increases the size of the failure stack (so the
>> sample file size has to be increased 8-fold to get a regex stack
>> overflow).
>
> Which IMO is exactly TRT, since re_max_failures was computed given the
> runtime stack size of 8MB, so having it bail out after merely 800KB
> doesn't sound right to me, don't you agree?

Yes, I suppose we should also try to make use of the stack, rather than
calling malloc, right?  Something like this:

diff --git i/src/regex.c w/src/regex.c
index d23ba01..dcabde5 100644
--- i/src/regex.c
+++ w/src/regex.c
@@ -447,7 +447,11 @@ init_syntax_once (void)
 #else /* not REGEX_MALLOC  */
 
 # ifdef emacs
-#  define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA
+#  define REGEX_USE_SAFE_ALLOCA                                         \
+  ptrdiff_t sa_avail = re_max_failures                                  \
+    * TYPICAL_FAILURE_SIZE * sizeof (fail_stack_elt_t);                 \
+  ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
+
 #  define REGEX_SAFE_FREE() SAFE_FREE ()
 #  define REGEX_ALLOCATE SAFE_ALLOCA
 # else



>
>> Strangely, changing the value in the definition of re_max_failures
>> doesn't seem to have any effect, it stays 40000 regardless.  I am
>> quite confused.
>
> I don't think I follow.  Can you tell what you tried to change, and
> where did you see the lack of any effect?

Modifying the initial value of re_max_failures as in the below patch,
has no effect on the size of the file at which stack regex overflow
happens (I confirmed it gets compiled by adding a #warning on the
previous line).  Printing re_max_failures in gdb still shows 40000.

diff --git i/src/regex.c w/src/regex.c
index d23ba01..d9170c0 100644
--- i/src/regex.c
+++ w/src/regex.c
@@ -1249,7 +1249,7 @@ WEAK_ALIAS (__re_set_syntax, re_set_syntax)
    whose default stack limit is 2mb.  In order for a larger
    value to work reliably, you have to try to make it accord
    with the process stack limit.  */
-size_t re_max_failures = 40000;
+size_t re_max_failures = 20;
 # else
 size_t re_max_failures = 4000;
 # endif


Actually I find Emacs still compiles if I removed that line completely,
there's just a compile warning saying

    regex.o: In function `re_match_2_internal':
    /home/npostavs/src/emacs/emacs-master/lib-src/../src/regex.c:5529: warning: the 're_max_failures' variable is obsolete and will go away.

I guess there's some kind of definition of it in libc?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 13 Nov 2016 16:13:01 GMT) Full text and rfc822 format available.

Message #20 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 13 Nov 2016 18:12:47 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Sun, 13 Nov 2016 00:39:39 -0500
> 
> > I think the patch can be simplified, where we now multiply by the size
> > of fail_stack_elt_t and then divide by it: simply remove both the
> > multiplication and the division.  That will make the code easier to
> > read, and will make the units of each variable clear, something that I
> > think is at the heart of this issue.
> 
> Ah, right.

Thanks, LGTM.

> >> but effectively increases the size of the failure stack (so the
> >> sample file size has to be increased 8-fold to get a regex stack
> >> overflow).
> >
> > Which IMO is exactly TRT, since re_max_failures was computed given the
> > runtime stack size of 8MB, so having it bail out after merely 800KB
> > doesn't sound right to me, don't you agree?
> 
> Yes, I suppose we should also try to make use of the stack, rather than
> calling malloc, right?  Something like this:
> 
> diff --git i/src/regex.c w/src/regex.c
> index d23ba01..dcabde5 100644
> --- i/src/regex.c
> +++ w/src/regex.c
> @@ -447,7 +447,11 @@ init_syntax_once (void)
>  #else /* not REGEX_MALLOC  */
>  
>  # ifdef emacs
> -#  define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA
> +#  define REGEX_USE_SAFE_ALLOCA                                         \
> +  ptrdiff_t sa_avail = re_max_failures                                  \
> +    * TYPICAL_FAILURE_SIZE * sizeof (fail_stack_elt_t);                 \
> +  ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
> +

Yes.  And please also add a comment there saying that this replaces
USE_SAFE_ALLOCA.

> -size_t re_max_failures = 40000;
> +size_t re_max_failures = 20;
>  # else
>  size_t re_max_failures = 4000;
>  # endif
> 
> 
> Actually I find Emacs still compiles if I removed that line completely,
> there's just a compile warning saying
> 
>     regex.o: In function `re_match_2_internal':
>     /home/npostavs/src/emacs/emacs-master/lib-src/../src/regex.c:5529: warning: the 're_max_failures' variable is obsolete and will go away.
> 
> I guess there's some kind of definition of it in libc?

Most probably.  You should be able to see that using "nm -A".  If
that's indeed so, I think we should rename that variable to something
like emacs_re_max_failures, to avoid stomping on the libc variable..




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Tue, 15 Nov 2016 03:08:02 GMT) Full text and rfc822 format available.

Message #23 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Mon, 14 Nov 2016 22:08:18 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

>> 
>> Yes, I suppose we should also try to make use of the stack, rather than
>> calling malloc, right?  Something like this:
>> 
>> diff --git i/src/regex.c w/src/regex.c
>> index d23ba01..dcabde5 100644
>> --- i/src/regex.c
>> +++ w/src/regex.c
>> @@ -447,7 +447,11 @@ init_syntax_once (void)
>>  #else /* not REGEX_MALLOC  */
>>  
>>  # ifdef emacs
>> -#  define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA
>> +#  define REGEX_USE_SAFE_ALLOCA                                         \
>> +  ptrdiff_t sa_avail = re_max_failures                                  \
>> +    * TYPICAL_FAILURE_SIZE * sizeof (fail_stack_elt_t);                 \
>> +  ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
>> +
>
> Yes.  And please also add a comment there saying that this replaces
> USE_SAFE_ALLOCA.

Actually, we should avoid increasing this limit if the stack wasn't
increased, right?  Here's what I came up with, I think it doesn't cover
Cygwin/Windows though.

diff --git c/src/emacs.c i/src/emacs.c
index b74df21..d4655c8 100644
--- c/src/emacs.c
+++ i/src/emacs.c
@@ -831,8 +831,8 @@ main (int argc, char **argv)
 	 re_max_failures, then add 33% to cover the size of the
 	 smaller stacks that regex.c successively allocates and
 	 discards on its way to the maximum.  */
-      int ratio = 20 * sizeof (char *);
-      ratio += ratio / 3;
+      int min_ratio = 20 * sizeof (char *);
+      int ratio = min_ratio + min_ratio / 3;
 
       /* Extra space to cover what we're likely to use for other reasons.  */
       int extra = 200000;
@@ -869,6 +869,7 @@ main (int argc, char **argv)
 
       /* Don't let regex.c overflow the stack.  */
       re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
+      emacs_re_safe_alloca = re_max_failures * min_ratio;
     }
 #endif /* HAVE_SETRLIMIT and RLIMIT_STACK and not CYGWIN */
 
diff --git c/src/regex.c i/src/regex.c
index d23ba01..56cffa1 100644
--- c/src/regex.c
+++ i/src/regex.c
@@ -447,7 +447,13 @@ init_syntax_once (void)
 #else /* not REGEX_MALLOC  */
 
 # ifdef emacs
-#  define REGEX_USE_SAFE_ALLOCA USE_SAFE_ALLOCA
+/* This may be adjusted in main(), if the stack is successfully grown.  */
+ptrdiff_t emacs_re_safe_alloca = MAX_ALLOCA;
+/* Like USE_SAFE_ALLOCA, but use emacs_re_safe_alloca.  */
+#  define REGEX_USE_SAFE_ALLOCA                                        \
+  ptrdiff_t sa_avail = emacs_re_safe_alloca;                           \
+  ptrdiff_t sa_count = SPECPDL_INDEX (); bool sa_must_free = false
+
 #  define REGEX_SAFE_FREE() SAFE_FREE ()
 #  define REGEX_ALLOCATE SAFE_ALLOCA
 # else
diff --git c/src/regex.h i/src/regex.h
index 4922440..45cbe0a 100644
--- c/src/regex.h
+++ i/src/regex.h
@@ -187,6 +187,11 @@
 /* Roughly the maximum number of failure points on the stack.  */
 extern size_t re_max_failures;
 
+#ifdef emacs
+/* Amount of memory that we can safely stack allocate.  */
+extern ptrdiff_t emacs_re_safe_alloca;
+#endif
+
 
 /* Define combinations of the above bits for the standard possibilities.
    (The [[[ comments delimit what gets put into the Texinfo file, so


>> 
>> 
>> Actually I find Emacs still compiles if I removed that line completely,
>> there's just a compile warning saying
>> 
>>     regex.o: In function `re_match_2_internal':
>>     /home/npostavs/src/emacs/emacs-master/lib-src/../src/regex.c:5529: warning: the 're_max_failures' variable is obsolete and will go away.
>> 
>> I guess there's some kind of definition of it in libc?
>
> Most probably.  You should be able to see that using "nm -A".  If
> that's indeed so, I think we should rename that variable to something
> like emacs_re_max_failures, to avoid stomping on the libc variable..

Ah, right:

    $ nm -A /usr/lib/libc.so.6 | grep re_max_failures
    /usr/lib/libc.so.6:0000000000000000 n __evoke_link_warning_re_max_failures
    /usr/lib/libc.so.6:00000000003981d8 D re_max_failures





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Tue, 15 Nov 2016 16:13:02 GMT) Full text and rfc822 format available.

Message #26 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Tue, 15 Nov 2016 18:12:09 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Mon, 14 Nov 2016 22:08:18 -0500
> 
> Actually, we should avoid increasing this limit if the stack wasn't
> increased, right?  Here's what I came up with, I think it doesn't cover
> Cygwin/Windows though.
> 
> diff --git c/src/emacs.c i/src/emacs.c
> index b74df21..d4655c8 100644
> --- c/src/emacs.c
> +++ i/src/emacs.c
> @@ -831,8 +831,8 @@ main (int argc, char **argv)
>  	 re_max_failures, then add 33% to cover the size of the
>  	 smaller stacks that regex.c successively allocates and
>  	 discards on its way to the maximum.  */
> -      int ratio = 20 * sizeof (char *);
> -      ratio += ratio / 3;
> +      int min_ratio = 20 * sizeof (char *);
> +      int ratio = min_ratio + min_ratio / 3;
>  
>        /* Extra space to cover what we're likely to use for other reasons.  */
>        int extra = 200000;
> @@ -869,6 +869,7 @@ main (int argc, char **argv)
>  
>        /* Don't let regex.c overflow the stack.  */
>        re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
> +      emacs_re_safe_alloca = re_max_failures * min_ratio;
>      }
>  #endif /* HAVE_SETRLIMIT and RLIMIT_STACK and not CYGWIN */

Right, but I have 2 comments:

  . we shouldn't set re_max_failures to zero if the amount of stack is
    less than 'extra', since in that case we will allocate the failure
    stack off the heap;
  . emacs_re_safe_alloca should have its minimum value MAX_ALLOCA, not
    zero, because SAFE_ALLOCA can still be used in regex.c, even
    though the failure stack will be malloc'ed.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Wed, 16 Nov 2016 01:06:01 GMT) Full text and rfc822 format available.

Message #29 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Tue, 15 Nov 2016 20:06:29 -0500
>> @@ -869,6 +869,7 @@ main (int argc, char **argv)
>>  
>>        /* Don't let regex.c overflow the stack.  */
>>        re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
>> +      emacs_re_safe_alloca = re_max_failures * min_ratio;
>>      }
>>  #endif /* HAVE_SETRLIMIT and RLIMIT_STACK and not CYGWIN */
>
>   . we shouldn't set re_max_failures to zero if the amount of stack is
>     less than 'extra', since in that case we will allocate the failure
>     stack off the heap;

Then what should we set it to?  Maybe we shouldn't modify it at all,
since the stack isn't actually a limiting factor?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Wed, 16 Nov 2016 16:26:02 GMT) Full text and rfc822 format available.

Message #32 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Wed, 16 Nov 2016 18:25:22 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Tue, 15 Nov 2016 20:06:29 -0500
> 
> >> @@ -869,6 +869,7 @@ main (int argc, char **argv)
> >>  
> >>        /* Don't let regex.c overflow the stack.  */
> >>        re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
> >> +      emacs_re_safe_alloca = re_max_failures * min_ratio;
> >>      }
> >>  #endif /* HAVE_SETRLIMIT and RLIMIT_STACK and not CYGWIN */
> >
> >   . we shouldn't set re_max_failures to zero if the amount of stack is
> >     less than 'extra', since in that case we will allocate the failure
> >     stack off the heap;
> 
> Then what should we set it to?  Maybe we shouldn't modify it at all,
> since the stack isn't actually a limiting factor?

Yes, I think this is the best solution.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Wed, 16 Nov 2016 23:25:01 GMT) Full text and rfc822 format available.

Message #35 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Wed, 16 Nov 2016 18:25:22 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: npostavs <at> users.sourceforge.net
>> Cc: 24751 <at> debbugs.gnu.org
>> Date: Tue, 15 Nov 2016 20:06:29 -0500
>> 
>> >> @@ -869,6 +869,7 @@ main (int argc, char **argv)
>> >>  
>> >>        /* Don't let regex.c overflow the stack.  */
>> >>        re_max_failures = lim < extra ? 0 : min (lim - extra, SIZE_MAX) / ratio;
>> >> +      emacs_re_safe_alloca = re_max_failures * min_ratio;
>> >>      }
>> >>  #endif /* HAVE_SETRLIMIT and RLIMIT_STACK and not CYGWIN */
>> >
>> >   . we shouldn't set re_max_failures to zero if the amount of stack is
>> >     less than 'extra', since in that case we will allocate the failure
>> >     stack off the heap;
>> 
>> Then what should we set it to?  Maybe we shouldn't modify it at all,
>> since the stack isn't actually a limiting factor?
>
> Yes, I think this is the best solution.
>

One more question, is this comment (around line 1198) now obsolete?  (if
not, it sounds like we might still have some serious problems)

/* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
   searching and matching functions should not call alloca.  On some
   systems, alloca is implemented in terms of malloc, and if we're
   using the relocating allocator routines, then malloc could cause a
   relocation, which might (if the strings being searched are in the
   ralloc heap) shift the data out from underneath the regexp
   routines.

   Here's another reason to avoid allocation: Emacs
   processes input from X in a signal handler; processing X input may
   call malloc; if input arrives while a matching routine is calling
   malloc, then we're scrod.  But Emacs can't just block input while
   calling matching routines; then we don't notice interrupts when
   they come in.  So, Emacs blocks input around all regexp calls
   except the matching calls, which it leaves unprotected, in the
   faith that they will not malloc.  */

Also this one (around line 430)

/* Should we use malloc or alloca?  If REGEX_MALLOC is not defined, we
   use `alloca' instead of `malloc'.  This is because using malloc in
   re_search* or re_match* could cause memory leaks when C-g is used in
   Emacs; also, malloc is slower and causes storage fragmentation.  On
   the other hand, malloc is more portable, and easier to debug.

   Because we sometimes use alloca, some routines have to be macros,
   not functions -- `alloca'-allocated space disappears at the end of the
   function it is called in.  */





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Thu, 17 Nov 2016 16:22:02 GMT) Full text and rfc822 format available.

Message #38 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Thu, 17 Nov 2016 18:21:24 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Wed, 16 Nov 2016 18:25:22 -0500
> 
> One more question, is this comment (around line 1198) now obsolete?  (if
> not, it sounds like we might still have some serious problems)
> 
> /* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
>    searching and matching functions should not call alloca.  On some
>    systems, alloca is implemented in terms of malloc, and if we're
>    using the relocating allocator routines, then malloc could cause a
>    relocation, which might (if the strings being searched are in the
>    ralloc heap) shift the data out from underneath the regexp
>    routines.
> 
>    Here's another reason to avoid allocation: Emacs
>    processes input from X in a signal handler; processing X input may
>    call malloc; if input arrives while a matching routine is calling
>    malloc, then we're scrod.  But Emacs can't just block input while
>    calling matching routines; then we don't notice interrupts when
>    they come in.  So, Emacs blocks input around all regexp calls
>    except the matching calls, which it leaves unprotected, in the
>    faith that they will not malloc.  */

The second part is obsolete: we no longer do anything significant from
a signal handler, we just set a flag.

The first part is not obsolete, but its reasoning is backwards:
SAFE_ALLOCA indeed can call malloc, but it could only cause relocation
if REGEX_MALLOC is defined (and ralloc.c is compiled in).  And when
you define REGEX_MALLOC, MATCH_MAY_ALLOCATE is undefined.  So the text
there should be revised.

> Also this one (around line 430)
> 
> /* Should we use malloc or alloca?  If REGEX_MALLOC is not defined, we
>    use `alloca' instead of `malloc'.  This is because using malloc in
>    re_search* or re_match* could cause memory leaks when C-g is used in
>    Emacs; also, malloc is slower and causes storage fragmentation.  On
>    the other hand, malloc is more portable, and easier to debug.
> 
>    Because we sometimes use alloca, some routines have to be macros,
>    not functions -- `alloca'-allocated space disappears at the end of the
>    function it is called in.  */

This is correct AFAIU, but perhaps it's worth adding that even if
SAFE_ALLOCA decides to call malloc, it takes care to set up
unwind-protect scheme that will free the allocated memory upon C-g (or
any other throw-type op), and avoid leaking memory.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sat, 19 Nov 2016 10:03:02 GMT) Full text and rfc822 format available.

Message #41 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sat, 19 Nov 2016 12:02:14 +0200
One other comment: the value of 'extra' in emacs.c looks too small to
me.  If we want to safely allocate the failure stack off the run-time
C stack, we should make 'extra' large enough to cover a typical GC,
which can take as many as 30K stack frames.  So I'd enlarge 'extra' to
be something like 1500000.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 01 Jan 2017 18:33:01 GMT) Full text and rfc822 format available.

Message #44 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 01 Jan 2017 13:33:35 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> /* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
>>    searching and matching functions should not call alloca.  On some
>>    systems, alloca is implemented in terms of malloc, and if we're
>>    using the relocating allocator routines, then malloc could cause a
>>    relocation, which might (if the strings being searched are in the
>>    ralloc heap) shift the data out from underneath the regexp
>>    routines.
>> 
>>    [...]
>
> The first part is not obsolete, but its reasoning is backwards:
> SAFE_ALLOCA indeed can call malloc, but it could only cause relocation
> if REGEX_MALLOC is defined (and ralloc.c is compiled in).  And when
> you define REGEX_MALLOC, MATCH_MAY_ALLOCATE is undefined.  So the text
> there should be revised.

Is there ever any case where REGEX_MALLOC is defined?  I can't see where
it happens.  I don't understand why you say relocation is dependent on
REGEX_MALLOC, I thought only REL_ALLOC affects that.

And since we added r_alloc_inhibit_buffer_relocation around the regex
calls, aren't all these concerns about relocation obsolete?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 01 Jan 2017 18:43:02 GMT) Full text and rfc822 format available.

Message #47 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 01 Jan 2017 20:41:57 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Sun, 01 Jan 2017 13:33:35 -0500
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> >> 
> >> /* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
> >>    searching and matching functions should not call alloca.  On some
> >>    systems, alloca is implemented in terms of malloc, and if we're
> >>    using the relocating allocator routines, then malloc could cause a
> >>    relocation, which might (if the strings being searched are in the
> >>    ralloc heap) shift the data out from underneath the regexp
> >>    routines.
> >> 
> >>    [...]
> >
> > The first part is not obsolete, but its reasoning is backwards:
> > SAFE_ALLOCA indeed can call malloc, but it could only cause relocation
> > if REGEX_MALLOC is defined (and ralloc.c is compiled in).  And when
> > you define REGEX_MALLOC, MATCH_MAY_ALLOCATE is undefined.  So the text
> > there should be revised.
> 
> Is there ever any case where REGEX_MALLOC is defined?  I can't see where
> it happens.

I don't understand the question.  You can compile regex.c with the
"-DREGEX_MALLOC" option whenever you like.  We don't do that, but as
long as the code which supports that is in regex.c, the comment goes
with it.

> I don't understand why you say relocation is dependent on
> REGEX_MALLOC, I thought only REL_ALLOC affects that.

REL_ALLOC determines whether ralloc.c is compiled in, which I
mentioned above.

> And since we added r_alloc_inhibit_buffer_relocation around the regex
> calls, aren't all these concerns about relocation obsolete?

The calls to r_alloc_inhibit_buffer_relocation are outside of regex.c,
so the comments in regex.c don't know anything about that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 01 Jan 2017 18:57:01 GMT) Full text and rfc822 format available.

Message #50 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 01 Jan 2017 13:57:05 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: npostavs <at> users.sourceforge.net
>> Cc: 24751 <at> debbugs.gnu.org
>> Date: Sun, 01 Jan 2017 13:33:35 -0500
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> >> 
>> >> /* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
>> >>    searching and matching functions should not call alloca.  On some
>> >>    systems, alloca is implemented in terms of malloc, and if we're
>> >>    using the relocating allocator routines, then malloc could cause a
>> >>    relocation, which might (if the strings being searched are in the
>> >>    ralloc heap) shift the data out from underneath the regexp
>> >>    routines.
>> >> 
>> >>    [...]
>> >
>> > The first part is not obsolete, but its reasoning is backwards:
>> > SAFE_ALLOCA indeed can call malloc, but it could only cause relocation
>> > if REGEX_MALLOC is defined (and ralloc.c is compiled in).  And when
>> > you define REGEX_MALLOC, MATCH_MAY_ALLOCATE is undefined.  So the text
>> > there should be revised.
>> 
>> Is there ever any case where REGEX_MALLOC is defined?  I can't see where
>> it happens.
>
> I don't understand the question.  You can compile regex.c with the
> "-DREGEX_MALLOC" option whenever you like.  We don't do that,
                                              ^^^^^^^^^^^^^^^^

Thanks, that's what I was wondering about in my question.

>
>> I don't understand why you say relocation is dependent on
>> REGEX_MALLOC, I thought only REL_ALLOC affects that.
>
> REL_ALLOC determines whether ralloc.c is compiled in, which I
> mentioned above.

But if REL_ALLOC is defined, then SAFE_ALLOCA could cause relocation
(via malloc) regardless of whether REGEX_MALLOC is defined or not, no?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 01 Jan 2017 20:08:02 GMT) Full text and rfc822 format available.

Message #53 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 01 Jan 2017 22:06:54 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Sun, 01 Jan 2017 13:57:05 -0500
> 
> >> I don't understand why you say relocation is dependent on
> >> REGEX_MALLOC, I thought only REL_ALLOC affects that.
> >
> > REL_ALLOC determines whether ralloc.c is compiled in, which I
> > mentioned above.
> 
> But if REL_ALLOC is defined, then SAFE_ALLOCA could cause relocation
> (via malloc) regardless of whether REGEX_MALLOC is defined or not, no?

Relocation as side effect of calling malloc only happens with buffer
text.  This is not what the comment in question alludes to.  It
alludes to this:

  /* Define how to allocate the failure stack.  */

  #if defined REL_ALLOC && defined REGEX_MALLOC

  # define REGEX_ALLOCATE_STACK(size)				\
    r_alloc (&failure_stack_ptr, (size))
  # define REGEX_REALLOCATE_STACK(source, osize, nsize)		\
    r_re_alloc (&failure_stack_ptr, (nsize))
  # define REGEX_FREE_STACK(ptr)					\
    r_alloc_free (&failure_stack_ptr)

  #else /* not using relocating allocator */

  # define REGEX_ALLOCATE_STACK(size) REGEX_ALLOCATE (size)
  # define REGEX_REALLOCATE_STACK(source, o, n) REGEX_REALLOCATE (source, o, n)
  # define REGEX_FREE_STACK(ptr) REGEX_FREE (ptr)

  #endif /* not using relocating allocator */

This calls ralloc.c functions directly for allocating/reallocating the
failure stack, when both REL_ALLOC and REGEX_MALLOC are defined.  So
the relocation in question is that of the failure stack, which won't
happen if we call malloc, even if REL_ALLOC is defined, because only
buffer text can be relocated when ralloc.c is called from malloc.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Mon, 02 Jan 2017 04:49:02 GMT) Full text and rfc822 format available.

Message #56 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 01 Jan 2017 23:49:46 -0500
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> 
>> /* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
>>    searching and matching functions should not call alloca.  On some
>>    systems, alloca is implemented in terms of malloc, and if we're
>>    using the relocating allocator routines, then malloc could cause a
>>    relocation, which might (if the strings being searched are in the
>>    ralloc heap) shift the data out from underneath the regexp
>>    routines.
>
>
> The first part is not obsolete, but its reasoning is backwards:
> SAFE_ALLOCA indeed can call malloc, but it could only cause relocation
> if REGEX_MALLOC is defined (and ralloc.c is compiled in).  And when
> you define REGEX_MALLOC, MATCH_MAY_ALLOCATE is undefined.  So the text
> there should be revised.

Everything you've said makes sense after your last message, but I'm
still unable to put it together and come up with a revised comment.
Could you make a suggestion?  Here are the rest of the changes.

[v1-0001-Use-expanded-stack-during-regex-matches.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Mon, 02 Jan 2017 15:25:01 GMT) Full text and rfc822 format available.

Message #59 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Mon, 02 Jan 2017 17:24:26 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Sun, 01 Jan 2017 23:49:46 -0500
> 
> Everything you've said makes sense after your last message, but I'm
> still unable to put it together and come up with a revised comment.
> Could you make a suggestion?

How about the below?

--- src/regex.c~0	2016-12-11 06:39:19.000000000 +0200
+++ src/regex.c	2017-01-02 12:40:44.266517100 +0200
@@ -1195,24 +1195,28 @@
     gettext_noop ("Range striding over charsets") /* REG_ERANGEX  */
   };
 
-/* Avoiding alloca during matching, to placate r_alloc.  */
+/* Whether to allocate memory during matching.  */
 
-/* Define MATCH_MAY_ALLOCATE unless we need to make sure that the
-   searching and matching functions should not call alloca.  On some
-   systems, alloca is implemented in terms of malloc, and if we're
-   using the relocating allocator routines, then malloc could cause a
-   relocation, which might (if the strings being searched are in the
-   ralloc heap) shift the data out from underneath the regexp
-   routines.
-
-   Here's another reason to avoid allocation: Emacs
-   processes input from X in a signal handler; processing X input may
-   call malloc; if input arrives while a matching routine is calling
-   malloc, then we're scrod.  But Emacs can't just block input while
-   calling matching routines; then we don't notice interrupts when
-   they come in.  So, Emacs blocks input around all regexp calls
-   except the matching calls, which it leaves unprotected, in the
-   faith that they will not malloc.  */
+/* Define MATCH_MAY_ALLOCATE to allow the searching and matching
+   functions allocate memory for the failure stack and registers.
+   Normally should be defined, because otherwise searching and
+   matching routines will have much smaller memory resources at their
+   disposal, and therefore might fail to handle complex regexps.
+   Therefore undefine MATCH_MAY_ALLOCATE only in the following
+   exceptional situations:
+
+   . When running on a system where memory is at premium.
+   . When alloca cannot be used at all, perhaps due to bugs in
+     its implementation, or its being unavailable, or due to a
+     very small stack size.  This requires to define REGEX_MALLOC
+     to use malloc instead, which in turn could lead to memory
+     leaks if search is interrupted by a signal.  (For these
+     reasons, defining REGEX_MALLOC when building Emacs
+     automatically undefines MATCH_MAY_ALLOCATE, but outside
+     Emacs you may not care about memory leaks.)  If you want to
+     prevent the memory leaks, undefine MATCH_MAY_ALLOCATE.
+   . When code that calls the searching and matching functions
+     cannot allow memory allocation, for whatever reasons.  */
 
 /* Normally, this is fine.  */
 #define MATCH_MAY_ALLOCATE




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Mon, 02 Jan 2017 18:30:03 GMT) Full text and rfc822 format available.

Message #62 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Mon, 02 Jan 2017 13:30:26 -0500
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: npostavs <at> users.sourceforge.net
>> Cc: 24751 <at> debbugs.gnu.org
>> Date: Sun, 01 Jan 2017 23:49:46 -0500
>> 
>> Everything you've said makes sense after your last message, but I'm
>> still unable to put it together and come up with a revised comment.
>> Could you make a suggestion?
>
> How about the below?

Looks good, I've incorporated it into the patch:

[v3-0001-Fix-computation-of-regex-stack-limit.patch (text/plain, attachment)]
[v3-0002-Use-expanded-stack-during-regex-matches.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Mon, 02 Jan 2017 19:23:01 GMT) Full text and rfc822 format available.

Message #65 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: npostavs <at> users.sourceforge.net
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Mon, 02 Jan 2017 21:22:28 +0200
> From: npostavs <at> users.sourceforge.net
> Cc: 24751 <at> debbugs.gnu.org
> Date: Mon, 02 Jan 2017 13:30:26 -0500
> 
> > How about the below?
> 
> Looks good, I've incorporated it into the patch:

Fine with me, thanks.




Added tag(s) patch. Request was from npostavs <at> users.sourceforge.net to control <at> debbugs.gnu.org. (Mon, 02 Jan 2017 21:59:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#24751; Package emacs. (Sun, 08 Jan 2017 23:49:02 GMT) Full text and rfc822 format available.

Message #70 received at 24751 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 24751 <at> debbugs.gnu.org
Subject: Re: bug#24751: 26.0.50;
 Regex stack overflow not detected properly (gets "Variable binding
 depth exceeds max-specpdl-size")
Date: Sun, 08 Jan 2017 18:49:05 -0500
tags 24751 fixed
close 24751 26.1
quit

I've pushed to master.

1: 2017-01-08 18:45:52 -0500 9a19f26cd796c7321f659a8dbea5296b0eeea51d
  Fix computation of regex stack limit

2: 2017-01-08 18:45:52 -0500 13c6f1d185d301aad2f6d756c148acb2edd0889f
  Use expanded stack during regex matches




Added tag(s) fixed. Request was from npostavs <at> users.sourceforge.net to control <at> debbugs.gnu.org. (Sun, 08 Jan 2017 23:49:02 GMT) Full text and rfc822 format available.

bug marked as fixed in version 26.1, send any further explanations to 24751 <at> debbugs.gnu.org and npostavs <at> users.sourceforge.net Request was from npostavs <at> users.sourceforge.net to control <at> debbugs.gnu.org. (Sun, 08 Jan 2017 23:49:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 06 Feb 2017 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 8 years and 195 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.