GNU bug report logs - #13802
stack overflow in mm-add-meta-html-tag
Reported by: Thien-Thi Nguyen <ttn <at> gnuvola.org>
Date: Sun, 24 Feb 2013 09:18:02 UTC
Severity: normal
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #11 received at 13802 <at> debbugs.gnu.org (full text, mbox):
> I see a "Stack overflow in regexp matcher" error traceable back to
> lisp/gnus/mm-decode.el func ‘mm-add-meta-html-tag’ fragment:
> (re-search-forward "\
> <meta\\s-+http-equiv=[\"']?content-type[\"']?\\s-+content=[\"']\
> text/\\(\\sw+\\)\\(?:\;\\s-*charset=\\(.+\\)\\)?[\"'][^>]*>" nil t)
Hmm... I don't see any obvious reason for a stack overflow unless the
text has some very long lines or a lot of space between elements.
> One idea (untested) is to replace the ".+" (used to match the charset)
> with a more specific pattern. Perhaps "[^<>]+" or "\\sw+"?
I don't think that would help. To avoid such an overflow, you need to
reduce the backtracking, i.e. reduce the number of cases where two
options are possible according to the simplistic regexp-optimizer.
The \s<CHAR> pattern is actually very poor in this respect, because the
optimizer can't know anything about the chars that it matches (since
that depends on text-properties).
The flip side is that replacing \\s- with [ \t\n] might help (this way,
the optimizer will see that the + repetition does not need backtracking
since a char cannot both match a loop iteration and the "after the
loop" content).
Similarly using [^;'\"]+ instead of \\sw+ would help, and maybe replacing
.+ with [^'\"\n]+ would help as well.
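To make the three substitutions concrete, here is an untested-in-Gnus
sketch of the regexp with \\s- replaced by [ \t\n], \\sw+ by [^;'\"]+
and .+ by [^'\"\n]+. The names `my-meta-content-type-regexp' and
`my-meta-content-type' are hypothetical, chosen for illustration only;
this is not the change that was actually applied to mm-decode.el.

```elisp
;; Hypothetical variant of the regexp from `mm-add-meta-html-tag',
;; using explicit character sets so that each repetition is followed
;; by text it cannot itself match.
(defconst my-meta-content-type-regexp
  (concat "<meta[ \t\n]+http-equiv=[\"']?content-type[\"']?"
          "[ \t\n]+content=[\"']"
          "text/\\([^;'\"]+\\)"                          ; group 1: subtype
          "\\(?:;[ \t\n]*charset=\\([^'\"\n]+\\)\\)?"    ; group 2: charset
          "[\"'][^>]*>")
  "Sketch of a META content-type regexp with less backtracking.")

(defun my-meta-content-type (html)
  "Return (SUBTYPE CHARSET) for the META content-type tag in HTML, or nil.
CHARSET is nil when the tag has no charset parameter."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t))         ; tag and attribute case varies
      (when (re-search-forward my-meta-content-type-regexp nil t)
        (list (match-string 1) (match-string 2))))))
```

Note that [^;'\"]+ is broader than \\sw+ (it also accepts newlines and
punctuation), so the two regexps are not strictly equivalent.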
> Thinking more systematically, maybe Emacs should add a condition
> ‘stack-overflow/regexp’ (or something like that) such that code can
> ‘condition-case’ for it and try a fallback path.
In reality, such overflow should only ever happen if you have backrefs
in your regexp.
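For reference, a fallback path can be sketched even without a dedicated
condition, assuming the overflow is signaled as a generic `error' whose
message is the "Stack overflow in regexp matcher" text quoted above.
The helper `my-search-with-fallback' and its arguments are hypothetical.

```elisp
(defun my-search-with-fallback (regexp fallback-regexp)
  "Search forward for REGEXP, retrying with FALLBACK-REGEXP on overflow.
Only an error whose message mentions the regexp matcher stack
overflow triggers the retry; any other error is re-signaled."
  (let ((start (point)))
    (condition-case err
        (re-search-forward regexp nil t)
      (error
       (if (string-match-p "Stack overflow in regexp matcher"
                           (error-message-string err))
           (progn
             ;; Restart from where the failed search began.
             (goto-char start)
             (re-search-forward fallback-regexp nil t))
         ;; Some other error: pass it on unchanged.
         (signal (car err) (cdr err)))))))
```

Matching on the message string is fragile, which is part of the
argument for a dedicated condition in the quoted proposal.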
Stefan
This bug report was last modified 9 years and 86 days ago.