GNU bug report logs -
#61514
30.0.50; sadistically long xml line hangs emacs
Reported by: "Mark A. Hershberger" <mah <at> everybody.org>
Date: Tue, 14 Feb 2023 21:05:02 UTC
Severity: normal
Found in version 30.0.50
Done: Gregory Heytings <gregory <at> heytings.org>
Bug is archived. No further changes may be made.
> I don't know... but I observe that this alone:
>
> (with-current-buffer (get-buffer-create "*bug*")
> (insert "<id name=\"")
> (insert (make-string 250000 ?n))
> (goto-char 5)
> (looking-at
> "[^<>\n]+?\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*=\\(?:[ \r\t\n]*\\('[^<'&\r\n\t]*\\([&\r\n\t][^<']*\\)?'\\|\"[^<\"&\r\n\t]*\\([&\r\n\t][^<\"]*\\)?\"\\)\\(?:\\([ \r\t\n]*>\\)\\|\\(?:\\([ \r\t\n]*/\\)\\(>\\)?\\)\\|\\([ \r\t\n]+\\)\\)\\)?"))
>
> doesn't fail, so I don't think it's this regexp which causes the overflow.
Indeed, there's still something unclear about how the overflow occurs,
but at least it seems my analysis doesn't match emacs-regex.c's because
I can get a stack overflow using the first part of the regexp:
    (with-current-buffer (get-buffer-create "*bug*")
      (erase-buffer)
      (insert "<id name=\"")
      (insert (make-string 2500000 ?n))
      (goto-char (+ (point-min) 10))
      (looking-at
       "\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[ \r\t\n]*="))
where I can even reduce the regexp down to "[-._[:alnum:]]*\t*=".
Looks like we're missing a case in our backtracking-elimination code.
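
For anyone who wants to try the reduced regexp without touching a live
buffer, here is a minimal sketch. The function name
bug61514-try-reduced-regexp and the use of a temporary buffer are
assumptions for illustration, not part of the original report. It relies
on the matcher signaling an ordinary error when its stack overflows,
which condition-case can catch; the result may differ between Emacs
versions.

```elisp
;; Hypothetical helper (not from the original report): run the reduced
;; regexp against COUNT copies of ?n and report what the matcher does.
(defun bug61514-try-reduced-regexp (count)
  "Match the reduced regexp against COUNT copies of ?n in a temp buffer.
Return the value of `looking-at', or the symbol `overflow' if the
regexp matcher signals an error."
  (with-temp-buffer
    (insert "<id name=\"")
    (insert (make-string count ?n))
    ;; Skip the 10 characters of `<id name="' so that point sits on
    ;; the first ?n, as in the recipe above.
    (goto-char (+ (point-min) 10))
    (condition-case nil
        (looking-at "[-._[:alnum:]]*\t*=")
      ;; A regexp stack overflow is signaled as a plain `error'.
      (error 'overflow))))
```

Evaluating (bug61514-try-reduced-regexp 2500000) mirrors the recipe
above. Since the buffer never contains "=", any return value other than
`overflow' should be nil.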
Stefan
This bug report was last modified 2 years and 147 days ago.