GNU bug report logs - #4175: 23.1; nxml-mode: Internal error in rng-validate-mode triggered
To reply to this bug, email your comments to 4175 AT debbugs.gnu.org.
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#4175; Package emacs. (Mon, 17 Aug 2009 12:05:05 GMT)
Acknowledgement sent to karme <karme <at> karme.de>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Mon, 17 Aug 2009 12:05:05 GMT)
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
If I try to validate a SVG with a huge path element in nxml-mode I get
the error: Internal error in rng-validate-mode triggered at buffer
position 616. Stack overflow in regexp matcher.
See also debian bug #541260
<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=541260>
In GNU Emacs 23.1.1 (x86_64-pc-linux-gnu)
of 2009-08-03 on nautilus, modified by Debian
configured using `configure '--build=x86_64-linux-gnu' '--host=x86_64-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs23:/etc/emacs:/usr/local/share/emacs/23.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/23.1/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/23.1/leim' '--with-x=no' 'build_alias=x86_64-linux-gnu' 'host_alias=x86_64-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: de_DE.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: nXML
Minor modes in effect:
server-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
C ESC O C ESC O C ESC O C ESC O C ESC O C ESC O C ESC
O C ESC O C ESC O C ESC O C ESC O C ESC O C ESC O C
ESC O C ESC O C ESC O C ESC O C ESC O C ESC O C ESC
O C i n SPC n x m l - m DEL DEL DEL DEL DEL DEL DEL
DEL DEL ESC [ 1 ; 5 C ESC O C i n SPC n x m l - m o
d e SPC C-a ESC q ESC O B ESC O B ESC O B ESC O A ESC
O A ESC O A ESC O A ESC O A ESC O A ESC O A ESC O A
ESC O A ESC O A ESC O B ESC O B ESC O B ESC O B ESC
O A C-k C-k C-k C-_ C-_ ESC O B ESC O B ESC O B ESC
O B ESC O B ESC O A C-k C-k C-k ESC O A ESC O D . ESC
O C ESC O B ESC O B C-x 1 ESC O B ESC O B ESC O B ESC
O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B
ESC O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC
O B ESC O B ESC O B ESC O B ESC O B ESC O B ESC O A
ESC O A C-x 2 b b C-_ C-_ ESC O A C-_ C-x b n x TAB
DEL DEL t TAB RET ESC x ESC O A RET
Recent messages:
Error during redisplay: (error Stack overflow in regexp matcher)
Making completion list...
byte-code: Command attempted to use minibuffer while in minibuffer
Internal nXML mode error in nxml-extend-after-change-region (Invalid search bound (wrong side of point)), degrading
run-hook-with-args: Wrong type argument: listp, t
Undo! [2 times]
Mark set [2 times]
Auto-saving...done
Undo! [4 times]
Redo!
Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#4175; Package emacs. (Sat, 12 Sep 2009 00:20:05 GMT)
Acknowledgement sent to Chong Yidong <cyd <at> stupidchicken.com>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sat, 12 Sep 2009 00:20:05 GMT)
Message #10 received at 4175 <at> emacsbugs.donarmstrong.com (full text, mbox):
> If I try to validate a SVG with a huge path element in nxml-mode I get
> the error: Internal error in rng-validate-mode triggered at buffer
> position 616. Stack overflow in regexp matcher.
Could you provide a precise, step-by-step recipe for reproducing this
problem, starting with `emacs -Q'?
Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#4175; Package emacs. (Mon, 14 Sep 2009 14:50:06 GMT)
Acknowledgement sent to Jens Thiele <karme <at> berlios.de>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Mon, 14 Sep 2009 14:50:06 GMT)
Message #15 received at 4175 <at> emacsbugs.donarmstrong.com (full text, mbox):
[Message part 1 (text/plain, inline)]
Chong Yidong <cyd <at> stupidchicken.com> writes:
>> If I try to validate a SVG with a huge path element in nxml-mode I get
>> the error: Internal error in rng-validate-mode triggered at buffer
>> position 616. Stack overflow in regexp matcher.
>
> Could you provide a precise, step-by-step recipe for reproducing this
> problem, starting with `emacs -Q'?
emacs -Q
M-: (progn
      (switch-to-buffer
       (find-file
        (url-file-local-copy
         "http://karme.de/delme/test.svg")))
      (nxml-mode)
      (switch-to-buffer "*Messages*"))
I also attached the test file. If really needed I can create a stripped
down version.
[test.svg (image/svg+xml, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 04:25:02 GMT)
Message #18 received at 4175 <at> debbugs.gnu.org (full text, mbox):
Jens Thiele <karme <at> berlios.de> writes:
> Chong Yidong <cyd <at> stupidchicken.com> writes:
>
>>> If I try to validate a SVG with a huge path element in nxml-mode I get
>>> the error: Internal error in rng-validate-mode triggered at buffer
>>> position 616. Stack overflow in regexp matcher.
>>
>> Could you provide a precise, step-by-step recipe for reproducing this
>> problem, starting with `emacs -Q'?
>
> emacs -Q
> M-: (progn
> (switch-to-buffer
> (find-file
> (url-file-local-copy
> "http://karme.de/delme/test.svg")))
> (nxml-mode)
> (switch-to-buffer "*Messages*"))
>
> I also attached the test file. If really needed I can create a stripped
> down version.
I've verified this is still a problem in Emacs 25.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 07:20:01 GMT)
Message #21 received at 4175 <at> debbugs.gnu.org (full text, mbox):
> From: Andrew Hyatt <ahyatt <at> gmail.com>
> Date: Thu, 11 Feb 2016 23:23:52 -0500
> Cc: 4175 <at> debbugs.gnu.org
>
> > emacs -Q
> > M-: (progn
> > (switch-to-buffer
> > (find-file
> > (url-file-local-copy
> > "http://karme.de/delme/test.svg")))
> > (nxml-mode)
> > (switch-to-buffer "*Messages*"))
> >
> > I also attached the test file. If really needed I can create a stripped
> > down version.
>
> I've verified this is still a problem in Emacs 25.
It doesn't happen for me, FWIW.
Can you show the offending regexp, and what code in nxml-mode creates
it?
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 10:13:01 GMT)
Message #24 received at 4175 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Fri, 12 Feb 2016 09:19:39 +0200 Eli Zaretskii <eliz <at> gnu.org> wrote:
>> From: Andrew Hyatt <ahyatt <at> gmail.com>
>> Date: Thu, 11 Feb 2016 23:23:52 -0500
>> Cc: 4175 <at> debbugs.gnu.org
>>
>> > emacs -Q
>> > M-: (progn
>> > (switch-to-buffer
>> > (find-file
>> > (url-file-local-copy
>> > "http://karme.de/delme/test.svg")))
>> > (nxml-mode)
>> > (switch-to-buffer "*Messages*"))
>> >
>> > I also attached the test file. If really needed I can create a stripped
>> > down version.
>>
>> I've verified this is still a problem in Emacs 25.
>
> It doesn't happen for me, FWIW.
>
> Can you show the offending regexp, and what code in nxml-mode creates
> it?
I can reproduce it in a91b4b51ddf2575d821adb8b84fdf32cff83886e (GNU
Emacs 25.0.90.2 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15) of
2016-02-11). Here's the backtrace:
[Message part 2 (text/plain, attachment)]
[Message part 3 (text/plain, inline)]
Steve Berman
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 12:03:01 GMT)
Message #27 received at 4175 <at> debbugs.gnu.org (full text, mbox):
> From: Stephen Berman <stephen.berman <at> gmx.net>
> Cc: Andrew Hyatt <ahyatt <at> gmail.com>, 4175 <at> debbugs.gnu.org
> Date: Fri, 12 Feb 2016 11:12:23 +0100
>
> I can reproduce it in a91b4b51ddf2575d821adb8b84fdf32cff83886e (GNU
> Emacs 25.0.90.2 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15) of
> 2016-02-11).
How large is the run-time stack on that system?
> Here's the backtrace:
>
> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
> looking-at("\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[
> \n]*=\\(?:[
> \n]*\\('[^<'&
> \n ]*\\([&
> \n ][^<']*\\)?'\\|\"[^<\"&
> \n ]*\\([&
> \n ][^<\"]*\\)?\"\\)\\(?:\\([
> \n]*>\\)\\|\\(?:\\([
> \n]*/\\)\\(>\\)?\\)\\|\\([
> \n]+\\)\\)\\)?")
> xmltok-scan-attributes()
> xmltok-scan-after-lt()
> xmltok-forward()
> rng-forward()
> rng-do-some-validation-1(rng-validate-while-idle-continue-p)
> rng-do-some-validation(rng-validate-while-idle-continue-p)
> rng-validate-while-idle(#<buffer url25099xa>)
Thanks. Perhaps some regexp guru could suggest how to make this
regexp less greedy.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 12:15:01 GMT)
Message #30 received at 4175 <at> debbugs.gnu.org (full text, mbox):
On Fri, 12 Feb 2016 14:02:10 +0200 Eli Zaretskii <eliz <at> gnu.org> wrote:
>> From: Stephen Berman <stephen.berman <at> gmx.net>
>> Cc: Andrew Hyatt <ahyatt <at> gmail.com>, 4175 <at> debbugs.gnu.org
>> Date: Fri, 12 Feb 2016 11:12:23 +0100
>>
>> I can reproduce it in a91b4b51ddf2575d821adb8b84fdf32cff83886e (GNU
>> Emacs 25.0.90.2 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15) of
>> 2016-02-11).
>
> How large is the run-time stack on that system?
ulimit -s says 8192, if that's what you mean.
Steve Berman
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Fri, 12 Feb 2016 15:16:02 GMT)
Message #33 received at 4175 <at> debbugs.gnu.org (full text, mbox):
> From: Stephen Berman <stephen.berman <at> gmx.net>
> Cc: ahyatt <at> gmail.com, 4175 <at> debbugs.gnu.org
> Date: Fri, 12 Feb 2016 13:14:41 +0100
>
> > How large is the run-time stack on that system?
>
> ulimit -s says 8192, if that's what you mean.
Same here, but it's a 32-bit build, so maybe that's the reason.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 02:54:02 GMT)
Message #36 received at 4175 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Stephen Berman <stephen.berman <at> gmx.net>
>> Cc: Andrew Hyatt <ahyatt <at> gmail.com>, 4175 <at> debbugs.gnu.org
>> Date: Fri, 12 Feb 2016 11:12:23 +0100
>>
>> I can reproduce it in a91b4b51ddf2575d821adb8b84fdf32cff83886e (GNU
>> Emacs 25.0.90.2 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15) of
>> 2016-02-11).
>
> How large is the run-time stack on that system?
>
>> Here's the backtrace:
>>
>> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
>> looking-at("\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[
>> \n]*=\\(?:[
>> \n]*\\('[^<'&
>> \n ]*\\([&
>> \n ][^<']*\\)?'\\|\"[^<\"&
>> \n ]*\\([&
>> \n ][^<\"]*\\)?\"\\)\\(?:\\([
>> \n]*>\\)\\|\\(?:\\([
>> \n]*/\\)\\(>\\)?\\)\\|\\([
>> \n]+\\)\\)\\)?")
>> xmltok-scan-attributes()
>> xmltok-scan-after-lt()
>> xmltok-forward()
>> rng-forward()
>> rng-do-some-validation-1(rng-validate-while-idle-continue-p)
>> rng-do-some-validation(rng-validate-while-idle-continue-p)
>> rng-validate-while-idle(#<buffer url25099xa>)
>
> Thanks. Perhaps some regexp guru could suggest how to make this
> regexp less greedy.
Maybe Mattias could take a look?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 09:25:01 GMT)
Message #39 received at 4175 <at> debbugs.gnu.org (full text, mbox):
On Sat, 9 Jul 2022 19:53:42 -0700 Stefan Kangas <stefan <at> marxist.se> wrote:
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
>>> From: Stephen Berman <stephen.berman <at> gmx.net>
>>> Cc: Andrew Hyatt <ahyatt <at> gmail.com>, 4175 <at> debbugs.gnu.org
>>> Date: Fri, 12 Feb 2016 11:12:23 +0100
>>>
>>> I can reproduce it in a91b4b51ddf2575d821adb8b84fdf32cff83886e (GNU
>>> Emacs 25.0.90.2 (x86_64-suse-linux-gnu, GTK+ Version 3.14.15) of
>>> 2016-02-11).
>>
>> How large is the run-time stack on that system?
>>
>>> Here's the backtrace:
>>>
>>> Debugger entered--Lisp error: (error "Stack overflow in regexp matcher")
>>> looking-at("\\(\\(?:\\(xmlns\\)\\|[_[:alpha:]][-._[:alnum:]]*\\)\\(:[_[:alpha:]][-._[:alnum:]]*\\)?\\)[
>>> \n]*=\\(?:[
>>> \n]*\\('[^<'&
>>> \n ]*\\([&
>>> \n ][^<']*\\)?'\\|\"[^<\"&
>>> \n ]*\\([&
>>> \n ][^<\"]*\\)?\"\\)\\(?:\\([
>>> \n]*>\\)\\|\\(?:\\([
>>> \n]*/\\)\\(>\\)?\\)\\|\\([
>>> \n]+\\)\\)\\)?")
>>> xmltok-scan-attributes()
>>> xmltok-scan-after-lt()
>>> xmltok-forward()
>>> rng-forward()
>>> rng-do-some-validation-1(rng-validate-while-idle-continue-p)
>>> rng-do-some-validation(rng-validate-while-idle-continue-p)
>>> rng-validate-while-idle(#<buffer url25099xa>)
>>
>> Thanks. Perhaps some regexp guru could suggest how to make this
>> regexp less greedy.
>
> Maybe Mattias could take a look?
FWIW, I cannot reproduce the error now with -Q in Emacs 27/28/29: in all
cases the mode line of the buffer containing the XML file says "nXML
valid" (both when executing the recipe with the URL, which is still
valid, as well as with the test.svg file provided in the bug thread.)
Steve Berman
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 10:49:02 GMT)
Message #42 received at 4175 <at> debbugs.gnu.org (full text, mbox):
close 4175
thanks
Stephen Berman <stephen.berman <at> gmx.net> writes:
> FWIW, I cannot reproduce the error now with -Q in Emacs 27/28/29: in all
> cases the mode line of the buffer containing the XML file says "nXML
> valid" (both when executing the recipe with the URL, which is still
> valid, as well as with the test.svg file provided in the bug thread.)
OK, let's assume it is fixed. I'm therefore closing this bug report.
If anyone can still reproduce, please reopen.
Bug closed; send any further explanations to 4175 <at> debbugs.gnu.org and karme <karme <at> karme.de>. Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Sun, 10 Jul 2022 10:49:03 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 11:01:02 GMT)
Message #47 received at 4175 <at> debbugs.gnu.org (full text, mbox):
The bug is still very much there: I can reproduce it by reducing emacs_re_max_failures from 40000 to 4000. It's just a matter of file size. The failing regexp (used at xmltok.el:735) is, after rx conversion,
(rx (group
     (| (group "xmlns")
        (: (in "_" alpha)
           (* (in "._-" alnum))))
     (? (group ":"
               (in "_" alpha)
               (* (in "._-" alnum)))))
    (* (in "\t\n\r "))
    "="
    (? (* (in "\t\n\r "))
       (group
        (| (: "'"
              (* (not (in "\t\n\r&'<")))
              (? (group
                  (in "\t\n\r&")
                  (* (not (in "'<")))))
              "'")
           (: "\""
              (* (not (in "\t\n\r\"&<")))  ;;
              (? (group                     ;;
                  (in "\t\n\r&")            ;;
                  (* (not (in "\"<")))))    ;;
              "\"")))
       (| (group
           (* (in "\t\n\r "))
           ">")
          (: (group
              (* (in "\t\n\r "))
              "/")
             (? (group ">")))
          (group
           (+ (in "\t\n\r "))))))
and the overflow likely occurs somewhere in the ;;-marked section above, while parsing the big d="..." attribute value. That value isn't huge (55 KiB) and in any case our parser clearly shouldn't need stack space proportional to the size of an XML attribute value. (The default stack limit fails with attributes around 300 KiB in size, which is not big for an SVG file.) Isolated test case:
(let ((s (concat "'" (make-string 300000 ?a) "'")))
  (string-match
   (rx "'"
       (* (not (in "\t\n\r&'<")))
       (? (group
           (in "\t\n\r&")
           (* (not (in "'<")))))
       "'")
   s))
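To probe where the limit lies on a particular build, the isolated case can be wrapped in `condition-case` so that an overflow is reported as a value instead of being signalled. This is only a sketch: the helper name `my/attr-regexp-overflows-p` is hypothetical, the sizes tried are arbitrary, and the threshold (if any) will depend on the Emacs version and on the failure-stack limit it was built with.

```elisp
(require 'rx)

(defun my/attr-regexp-overflows-p (size)
  "Return non-nil if matching a quoted value of SIZE chars signals an error.
Hypothetical helper; it uses the same sub-regexp as the isolated test case."
  (let ((s (concat "'" (make-string size ?a) "'")))
    (condition-case nil
        (progn
          (string-match
           (rx "'"
               (* (not (in "\t\n\r&'<")))
               (? (group
                   (in "\t\n\r&")
                   (* (not (in "'<")))))
               "'")
           s)
          ;; The match completed without signalling.
          nil)
      ;; Any error, including the regexp stack overflow, counts as failure.
      (error t))))

;; Try a few arbitrary sizes and collect (SIZE . OVERFLOWED) pairs.
(mapcar (lambda (n) (cons n (my/attr-regexp-overflows-p n)))
        '(1000 100000 300000 1000000))
```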
I suggest you rewrite the attribute parser so that it doesn't eat regexp stack. For instance,
(rx "'" (* (not (in "'<"))) "'")
doesn't consume stack (thanks to the on_failure_keep_string_jump optimisation). The parser needs to be a little more complex than that and validate entities (the &xyz; things) and detect (and recover from) common errors such as missing end quotes, so a single regexp isn't sufficient.
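As a rough illustration of scanning without the regexp matcher, a minimal sketch follows. It is not the code in xmltok.el and not a proposed patch: the function name `my/scan-attribute-value` is hypothetical, and it deliberately omits the entity validation and error recovery mentioned above. It relies on `skip-chars-forward`, which scans with a simple loop and does not use the regexp failure stack.

```elisp
(defun my/scan-attribute-value ()
  "Scan a quoted attribute value starting at point.
Point must be on the opening single or double quote.  On success, move
point past the closing quote and return (BEG . END), the bounds of the
value text.  If the value is unterminated, leave point where scanning
stopped and return nil.  Hypothetical sketch: entities are not checked."
  (let ((quote-char (char-after)))
    (when (memq quote-char '(?\' ?\"))
      (forward-char 1)
      (let ((beg (point)))
        ;; Skip everything up to the matching quote or a stray `<'.
        ;; This needs no backtracking, however long the value is.
        (skip-chars-forward (if (eq quote-char ?\') "^'<" "^\"<"))
        (when (eq (char-after) quote-char)
          (let ((end (point)))
            (forward-char 1)
            (cons beg end)))))))

;; Usage on a value of the size that trips the isolated test case.
(with-temp-buffer
  (insert "'" (make-string 300000 ?a) "' rest")
  (goto-char (point-min))
  (my/scan-attribute-value))
```

A real replacement would still have to walk the returned region to validate `&xyz;` references and decide how to recover when the closing quote is missing.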
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 11:02:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 11:03:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#4175; Package emacs. (Sun, 10 Jul 2022 13:07:02 GMT)
Message #56 received at 4175 <at> debbugs.gnu.org (full text, mbox):
reopen 4175
thanks
Mattias Engdegård <mattiase <at> acm.org> writes:
> The bug is still very much there: I can reproduce it by reducing
> emacs_re_max_failures from 40000 to 4000. It's just a matter of file
> size. The failing regexp (used at xmltok.el:735) is, after rx
> conversion,
Thanks, I'm therefore reopening this bug.
Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 10 Jul 2022 13:07:03 GMT)
This bug report was last modified 2 years and 338 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.