GNU bug report logs - #4175
23.1; nxml-mode: Internal error in rng-validate-mode triggered
Message #47 received at 4175 <at> debbugs.gnu.org:
The bug is still very much there: I can reproduce it by reducing emacs_re_max_failures from 40000 to 4000. It's just a matter of file size. The failing regexp (used at xmltok.el:735) is, after rx conversion,
(rx (group
     (| (group "xmlns")
        (: (in "_" alpha)
           (* (in "._-" alnum))))
     (? (group ":"
               (in "_" alpha)
               (* (in "._-" alnum)))))
    (* (in "\t\n\r "))
    "="
    (? (* (in "\t\n\r "))
       (group
        (| (: "'"
              (* (not (in "\t\n\r&'<")))
              (? (group
                  (in "\t\n\r&")
                  (* (not (in "'<")))))
              "'")
           (: "\""
              (* (not (in "\t\n\r\"&<")))   ;;
              (? (group                      ;;
                  (in "\t\n\r&")             ;;
                  (* (not (in "\"<")))))     ;;
              "\"")))
       (| (group
           (* (in "\t\n\r "))
           ">")
          (: (group
              (* (in "\t\n\r "))
              "/")
             (? (group ">")))
          (group
           (+ (in "\t\n\r "))))))
and the overflow likely occurs somewhere in the ;;-marked section above, while parsing the big d="..." attribute value. That value isn't huge (55 KiB), and in any case our parser clearly shouldn't need stack space proportional to the size of an XML attribute value. (With the default stack limit, matching fails for attributes around 300 KiB in size, which is not big for an SVG file.) Isolated test case:
(let ((s (concat "'" (make-string 300000 ?a) "'")))
  (string-match
   (rx "'"
       (* (not (in "\t\n\r&'<")))
       (? (group
           (in "\t\n\r&")
           (* (not (in "'<")))))
       "'")
   s))
I suggest you rewrite the attribute parser so that it doesn't eat regexp stack. For instance,
(rx "'" (* (not (in "'<"))) "'")
doesn't consume stack (thanks to the on_failure_keep_string_jump optimisation). The parser needs to be a little more complex than that and validate entities (the &xyz; things) and detect (and recover from) common errors such as missing end quotes, so a single regexp isn't sufficient.
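To illustrate the kind of scanning that avoids the regexp stack altogether, here is a minimal sketch that finds the end of a quoted attribute value with `skip-chars-forward`. The function name `my-scan-attribute-value` is hypothetical and not part of xmltok.el, and the sketch deliberately leaves out entity validation and error recovery, which a real replacement would need.

```elisp
;; Hypothetical helper for illustration only; not part of xmltok.el.
(defun my-scan-attribute-value ()
  "Scan a quoted attribute value starting at point.
On success, return (START . END) delimiting the text between the
quotes and leave point just after the closing quote.  Return nil if
point is not at a quote or if the value is not terminated before a
`<' or the end of the buffer; point may have moved in that case."
  (let ((quote-char (char-after)))
    (when (memq quote-char '(?\' ?\"))
      (forward-char 1)
      (let ((start (point)))
        ;; `skip-chars-forward' is a simple character scan, so it does
        ;; not use the regexp failure stack at all.
        (skip-chars-forward (if (eq quote-char ?\') "^'<" "^\"<"))
        (when (eq (char-after) quote-char)
          (forward-char 1)
          (cons start (1- (point))))))))

;; Usage on the same 300000-character value as the test case above.
(with-temp-buffer
  (insert "'" (make-string 300000 ?a) "'")
  (goto-char (point-min))
  (my-scan-attribute-value))
```

A caller could then walk the returned region separately to check for `&` references and stray whitespace characters, keeping each step linear in the attribute size.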
This bug report was last modified 2 years and 339 days ago.