GNU bug report logs - #61514
30.0.50; sadistically long xml line hangs emacs

Previous Next

Package: emacs;

Reported by: "Mark A. Hershberger" <mah <at> everybody.org>

Date: Tue, 14 Feb 2023 21:05:02 UTC

Severity: normal

Found in version 30.0.50

Done: Gregory Heytings <gregory <at> heytings.org>

Bug is archived. No further changes may be made.

Full log


Message #215 received at 61514 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gregory Heytings <gregory <at> heytings.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 61514 <at> debbugs.gnu.org, mah <at> everybody.org
Subject: Re: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Tue, 21 Feb 2023 11:58:02 -0500
>> OK, thanks.  Stefan, do you have any further comments/objections on
>> that version?

LGTM.

> By the way, I noted that a variant of the regexp still produces stack
>  overflows:
>
> (with-current-buffer (get-buffer-create "*bug*")
>   (erase-buffer)
>   (insert (make-string 266665 ?x) "=")
>   (goto-char (point-min))
>   (looking-at "[^y]*=*"))
>
> 266665 overflows, 266664 does not.  Is that expected?

Yes, there's "nothing" we can do about it (short of a significant
redesign of the engine): [^y] also matches = so at every iteration of
the loop, both paths (perform one more iteration, or exit the loop) are
valid, so we need to try them both, which we do via backtracking.

We'd need a "Thompson NFA" or something along the same lines to avoid
it.

Of course, we could also just backtrack less deep by exploring the
search space in a different order (e.g. the `*?` repetition does that),
but if we want to still return the same end result, we'd then have to
explore more of the search space (and after the fact, choose which
match we should return) rather than stop at the first match.


        Stefan





This bug report was last modified 2 years and 147 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.