GNU bug report logs - #61514
30.0.50; sadistically long xml line hangs emacs

Package: emacs;

Reported by: "Mark A. Hershberger" <mah <at> everybody.org>

Date: Tue, 14 Feb 2023 21:05:02 UTC

Severity: normal

Found in version 30.0.50

Done: Gregory Heytings <gregory <at> heytings.org>

Bug is archived. No further changes may be made.

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Gregory Heytings <gregory <at> heytings.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 61514 <at> debbugs.gnu.org, mah <at> everybody.org
Subject: bug#61514: 30.0.50; sadistically long xml line hangs emacs
Date: Mon, 20 Feb 2023 11:47:38 -0500
> I don't know... but I observe that this alone:
>
> (with-current-buffer (get-buffer-create "*bug*")
>   (insert "<id name=\"")
>   (insert (make-string 250000 ?n))
>   (goto-char 5)
>   (looking-at
> "[^<>\n]+?\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"))
>
> doesn't fail, so I don't think it's this regexp which causes the overflow.

Indeed, there's still something unclear about how the overflow occurs,
but at least it seems my analysis doesn't match emacs-regex.c's because
I can get a stack overflow using the first part of the regexp:

    (with-current-buffer (get-buffer-create "*bug*")
      (erase-buffer)
      (insert "<id name=\"")
      (insert (make-string 2500000 ?n))
      (goto-char (+ (point-min) 10))
      (looking-at
"\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*="))

where I can even reduce the regexp down to "[-._[:alnum:]]*\t*=".
Looks like we're missing a case in our backtracking-elimination code.
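
For reference, the reduced case might be sketched as follows.  This
assumes the same buffer contents as the recipe above; the function name
`bug61514-reduced-repro`, the optional length argument, the temporary
buffer, and the `condition-case` wrapper are all additions for
illustration, so that the outcome can be inspected instead of aborting
the command.

```elisp
;; Hypothetical helper, not part of Emacs: a sketch of the reduced
;; recipe.  Only the regexp differs from the recipe above.
(defun bug61514-reduced-repro (&optional len)
  "Match the reduced regexp against LEN `n' characters (default 2500000).
Return the value of `looking-at', or (signaled MESSAGE) if it errors."
  (with-temp-buffer
    (insert "<id name=\"")
    (insert (make-string (or len 2500000) ?n))
    ;; Skip the 10 characters of `<id name="' so point is on the first `n'.
    (goto-char (+ (point-min) 10))
    (condition-case err
        (looking-at "[-._[:alnum:]]*\t*=")
      ;; Catch whatever error the matcher signals and report its text.
      (error (list 'signaled (error-message-string err))))))
```

Since the buffer contains no "=" after point, the match should simply
fail and return nil; on an affected build, calling it with the default
length would presumably return the `signaled' list instead.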


        Stefan





This bug report was last modified 2 years and 147 days ago.

