GNU bug report logs - #13802
stack overflow in mm-add-meta-html-tag


Packages: gnus, emacs;

Reported by: Thien-Thi Nguyen <ttn <at> gnuvola.org>

Date: Sun, 24 Feb 2013 09:18:02 UTC

Severity: normal

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Message #11 received at 13802 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thien-Thi Nguyen <ttn <at> gnuvola.org>
Cc: 13802 <at> debbugs.gnu.org
Subject: Re: bug#13802: stack overflow in mm-add-meta-html-tag
Date: Sun, 24 Feb 2013 21:04:21 -0500

> I see a "Stack overflow in regexp matcher" error traceable back to
> lisp/gnus/mm-decode.el func ‘mm-add-meta-html-tag’ fragment:

>   (re-search-forward "\
>   <meta\\s-+http-equiv=[\"']?content-type[\"']?\\s-+content=[\"']\
>   text/\\(\\sw+\\)\\(?:\;\\s-*charset=\\(.+\\)\\)?[\"'][^>]*>" nil t)

Hmm... I don't see any obvious reason for a stack overflow unless the
text has some very long lines or a lot of space between elements.

> One idea (untested) is to replace the ".+" (used to match the charset)
> with a more specific pattern.  Perhaps "[^<>]+" or "\\sw+"?

I don't think that would help.  To avoid such overflow, you need to
reduce the backtracking, i.e. reduce the number of cases where two
options are possible according to the simplistic regexp-optimizer.
The \s<CHAR> pattern is actually very poor in this respect, because the
optimizer can't know anything about the chars that it matches (since
that depends on text-properties).
The flip side is that replacing \\s- with [ \t\n] might help (this way,
the optimizer will see that the + repetition does not need backtracking
since a char cannot both match a loop iteration and the "after the
loop" content).
Similarly using [^;'\"]+ instead of \\sw+ would help, and maybe replacing
.+ with [^'\"\n]+ would help as well.
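
The three substitutions suggested above could be combined roughly as
follows.  This is only a sketch, not the code in mm-decode.el: the names
my-meta-content-type-regexp and my-find-meta-content-type are made up
for illustration, binding case-fold-search to t is an assumption, and the
explicit character sets accept slightly different text than the original
syntax classes (for example [^;'\"]+ is broader than \\sw+).

```elisp
;; Hypothetical names throughout; this is not the code in mm-decode.el.
(defconst my-meta-content-type-regexp
  (concat
   ;; [ \t\n] instead of \\s-: an explicit set the optimizer can analyze.
   "<meta[ \t\n]+http-equiv=[\"']?content-type[\"']?[ \t\n]+content=[\"']"
   ;; [^;'\"]+ instead of \\sw+ for the subtype after "text/".
   "text/\\([^;'\"]+\\)"
   ;; [^'\"\n]+ instead of .+ for the charset value.
   "\\(?:;[ \t\n]*charset=\\([^'\"\n]+\\)\\)?"
   "[\"'][^>]*>")
  "Sketch of a meta content-type regexp using explicit character sets.")

(defun my-find-meta-content-type ()
  "Search the current buffer from its start for a meta content-type tag.
Return (SUBTYPE . CHARSET), with CHARSET nil if absent, or nil if no match."
  (save-excursion
    (goto-char (point-min))
    ;; Assumption: the match should be case-insensitive, since HTML writes
    ;; the attribute value as "Content-Type" about as often as lowercase.
    (let ((case-fold-search t))
      (when (re-search-forward my-meta-content-type-regexp nil t)
        (cons (match-string 1) (match-string 2))))))
```

Calling my-find-meta-content-type in a buffer holding an HTML part
returns the subtype and charset as strings, or nil when no such tag is
found.  Whether this actually avoids the overflow on the offending
message is untested here.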

> Thinking more systematically, maybe Emacs should add a condition
> ‘stack-overflow/regexp’ (or something like that) such that code can
> ‘condition-case’ for it and try a fallback path.

In reality, such overflow should only ever happen if you have backrefs
in your regexp.
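
For reference, the fallback idea quoted above can be approximated
without a dedicated condition.  The sketch below assumes the overflow is
signaled as a plain `error' whose message contains "Stack overflow in
regexp matcher" (the text quoted in the report); the function name and
both of its arguments are hypothetical.

```elisp
(defun my-re-search-forward-with-fallback (regexp fallback-regexp)
  "Search forward for REGEXP, retrying with FALLBACK-REGEXP on overflow.
Only an error whose message mentions the regexp matcher's stack overflow
triggers the retry; any other error is signaled again unchanged."
  (let ((start (point)))
    (condition-case err
        (re-search-forward regexp nil t)
      (error
       (if (string-match-p "Stack overflow in regexp matcher"
                           (error-message-string err))
           (progn
             ;; Restart from where the failed search began.
             (goto-char start)
             (re-search-forward fallback-regexp nil t))
         ;; Not an overflow: pass the original error along.
         (signal (car err) (cdr err)))))))
```

Matching on the message text is fragile, which is presumably part of
why a dedicated condition was proposed in the first place.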


        Stefan




This bug report was last modified 9 years and 86 days ago.
