GNU bug report logs - #26533: 26.0.50; xml-parse-region's symbol-qname argument is ignored
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#26533; Package emacs. (Sun, 16 Apr 2017 12:49:02 GMT)
Acknowledgement sent to Christopher Wellons <wellons <at> nullprogram.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 16 Apr 2017 12:49:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
A bug was introduced in aea67018 that causes the special "symbol-qnames"
value for PARSE-NS to be ignored, as if it were nil. This information is
discarded by the change to xml-parse-attlist, so functions further down
the line see the argument as if it was set to nil.
Here's an example of the bug:
(with-temp-buffer
  (insert "<root a:b='c'></root>")
  (let ((xml-default-ns ()))
    (xml-parse-region nil nil nil nil 'symbol-qnames)))
Prior to this commit (Emacs 25.1 and earlier) the result is:
((root ((b . "c"))))
After this commit:
((root ((a:b . "c"))))
This is the same as PARSE-NS being set to nil.
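For reference, the reproduction above could be captured as a regression test along these lines. This is only a sketch: the test name `xml-test-symbol-qnames-honored` is hypothetical, and the expected values are simply the ones quoted in this report for Emacs 25.1 and for PARSE-NS set to nil.

```elisp
(require 'ert)
(require 'xml)  ; load xml.el before `xml-default-ns' is let-bound

;; Hypothetical test name; expected values are taken from this report.
(ert-deftest xml-test-symbol-qnames-honored ()
  "PARSE-NS set to `symbol-qnames' must not be treated as nil."
  (with-temp-buffer
    (insert "<root a:b='c'></root>")
    ;; With an empty namespace alist the unknown prefix "a" is dropped.
    (let ((xml-default-ns ()))
      (should (equal (xml-parse-region (point-min) (point-max)
                                       nil nil 'symbol-qnames)
                     '((root ((b . "c")))))))
    ;; Without namespace parsing the qualified name is kept verbatim.
    (should (equal (xml-parse-region (point-min) (point-max))
                   '((root ((a:b . "c"))))))))
```

On a build affected by the bug, the first `should` form is the one expected to fail.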
Reply sent to David Engster <deng <at> randomsample.de>: You have taken responsibility. (Mon, 17 Apr 2017 15:34:02 GMT)
Notification sent to Christopher Wellons <wellons <at> nullprogram.com>: bug acknowledged by developer. (Mon, 17 Apr 2017 15:34:02 GMT)
Message #10 received at 26533-done <at> debbugs.gnu.org:
Christopher Wellons writes:
> A bug was introduced in aea67018 that causes the special "symbol-qnames"
> value for PARSE-NS to be ignored, as if it were nil. This information is
> discarded by the change to xml-parse-attlist, so functions further down
> the line see the argument as if it was set to nil.
>
> Here's an example of the bug:
>
> (with-temp-buffer
>   (insert "<root a:b='c'></root>")
>   (let ((xml-default-ns ()))
>     (xml-parse-region nil nil nil nil 'symbol-qnames)))
>
> Prior to this commit (Emacs 25.1 and earlier) the result is:
>
> ((root ((b . "c"))))
>
> After this commit:
>
> ((root ((a:b . "c"))))
>
> This is the same as PARSE-NS being set to nil.
Thanks for the report.
You are right that the fix for bug #23440 was not correct. I have now
pushed a hopefully better version to master.
Note however that your test above has two problems: First, it's invalid
XML since you're using an undeclared prefix (so the parser should rather
throw an error, but I'm not eager to make the xml parser more strict, as
there's a lot of invalid XML in the wild). Second, I don't understand
why you let-bind `xml-default-ns' to nil. This will break namespace
expansion, and it will actually do this for the whole Emacs session if
xml.el gets autoloaded during the above.
-David
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#26533; Package emacs. (Mon, 17 Apr 2017 16:30:02 GMT)
Message #13 received at 26533-done <at> debbugs.gnu.org:
Thanks, David! Your fix works fine as far as I can tell.
I'm using this trick in Elfeed (a syndication feed reader) as a fast
method to strip all namespaces from the XML as it's being parsed. As you
said, there's a lot of invalid XML in the wild. I've found it works a
lot better to ignore namespaces and strictness, instead extracting the
required information heuristically as long as it's reasonably close.
Otherwise there would be a whole lot more feeds that wouldn't work well,
or at all, in Elfeed.
I had noticed with symbol-qnames that xml-parse-region drops unknown
namespaces. Since this information comes from an alist, that seemed like
reasonable behavior and I assumed it was intentional -- though signaling
an error would also be reasonable. To tightly control which namespaces
are stripped, I bind xml-default-ns to my own alist for that call. This
feels like the natural and lispy way to use this function.
The file that binds xml-default-ns requires the xml package explicitly,
so there's no risk of it autoloading while it's bound. Though that's an
interesting consequence I hadn't considered before. I _have_ seen
similar issues with accept-process-output when arbitrary process events
are handled while the stack is in an unusual state.
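A minimal sketch of the binding pattern described above, assuming nothing about Elfeed's actual code: the function name `my-xml-parse-stripping-ns` and its NS-ALIST argument are hypothetical. It requires xml.el at top level, so the `let` never wraps an autoload, and binds `xml-default-ns` only for the duration of the parse.

```elisp
(require 'xml)  ; load explicitly so the binding below never wraps an autoload

(defun my-xml-parse-stripping-ns (string &optional ns-alist)
  "Parse the XML in STRING with `xml-default-ns' bound to NS-ALIST.
Hypothetical helper.  NS-ALIST has the same shape as
`xml-default-ns'; with the default of nil, every prefix is
unknown to the parser."
  (with-temp-buffer
    (insert string)
    ;; The binding is dynamic and ends with this form, so other callers
    ;; of xml.el keep seeing the default value.
    (let ((xml-default-ns ns-alist))
      (xml-parse-region (point-min) (point-max) nil nil 'symbol-qnames))))

;; With the fix applied, this should give the Emacs 25.1 result quoted
;; earlier in the thread: ((root ((b . "c"))))
(my-xml-parse-stripping-ns "<root a:b='c'></root>")
```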
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 16 May 2017 11:24:04 GMT)