GNU bug report logs - #4950
`xml-parse-file' returns incorrect results strings after `>' before `<' when CR\LF TAB+

Package: emacs

Reported by: MON KEY <monkey <at> sandpframing.com>

Date: Tue, 17 Nov 2009 22:20:03 UTC

Severity: normal

Tags: notabug

Done: Chong Yidong <cyd <at> gnu.org>

Bug is archived. No further changes may be made.

From: Chong Yidong <cyd <at> gnu.org>
To: MON KEY <monkey <at> sandpframing.com>
Cc: 4950 <at> debbugs.gnu.org
Subject: bug#4950: `xml-parse-file' returns incorrect results strings after `>' before `<' when CR\LF TAB+
Date: Sun, 01 Jul 2012 19:22:33 +0800

MON KEY <monkey <at> sandpframing.com> writes:

> <ELEMENT attr1="a1" attr2="a2" attr3="a3" attr4="a4" attr5="a5">CR\LF
> TAB TAB TAB <NEXT-NODE>
>
> Returns (:NOTE with my pp-ing to help clarify the problem):
>
> (ELEMENT nil
>          ((attr1 . "a1")
>           (attr2 . "a2")
>           (attr3 . "a3")
>           (attr4 . "a4")
>           (attr5 . "a5") "
>             " ;; <-i.e. (mapconcat #'char-to-string '(32 10 9 9 9) "")
>           (NEXT-NODE nil (...
>
> Is it fair/safe to assume that where these types of sequences occur
> they are not part of the XML and can be removed with a regexp?

No.

XML 1.0 Recommendation, Section 2.10 White Space Handling:

"An XML processor MUST always pass all characters in a document that are
not markup through to the application."
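
In other words, the parser has to hand the whitespace over, and deciding
whether it matters is left to the application. A minimal sketch of that
split follows, written in Python with the standard xml.etree.ElementTree
module rather than Emacs's xml.el, purely as an illustration. The sample
document and the helper name strip_blank_text are assumptions made up for
this example; they come from neither the report nor any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the report: a newline and three tabs
# sit between the start tag and the first child element.
DOC = '<ELEMENT attr1="a1">\n\t\t\t<NEXT-NODE>text</NEXT-NODE>\n</ELEMENT>'


def strip_blank_text(elem):
    """Drop whitespace-only text/tail strings in place, after parsing.

    This is an application-level choice. It is only safe when the
    document type does not treat such whitespace as significant.
    """
    for node in elem.iter():
        if node.text is not None and not node.text.strip():
            node.text = None
        if node.tail is not None and not node.tail.strip():
            node.tail = None
    return elem


root = ET.fromstring(DOC)
preserved = root.text      # the parser passed the whitespace through
strip_blank_text(root)     # the application then chose to discard it
child = root[0]            # the NEXT-NODE element, content untouched
```

Working on the parsed tree, instead of running a regexp over the raw
file, leaves attribute values and mixed content such as "text" alone.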




This bug report was last modified 13 years and 21 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.