GNU bug report logs - #27178
libxml-parse-*-region functions discard-comments argument only applies to top level comments
Reported by: Sean McAfee <eefacm <at> gmail.com>
Date: Thu, 1 Jun 2017 00:08:02 UTC
Severity: normal
Tags: confirmed, fixed
Found in versions 26.0.50, 25.2
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #8 received at 27178 <at> debbugs.gnu.org:
retitle 27178 libxml-parse-*-region functions discard-comments argument only applies to top level comments
found 27178 25.2
tags 27178 confirmed
quit
Sean McAfee <eefacm <at> gmail.com> writes:
> The libxml-parse-html-region and libxml-parse-xml-region functions both
> appear to ignore their discard-comments parameters.
>
> When I enter the following text in a buffer and mark it:
>
> <p>This <!-- and --> that</p>
>
> Then the result of evaluating the expression
>
> (libxml-parse-html-region (mark) (point) nil t)
>
> is
>
> (html nil (body nil (p nil "This " (comment nil " and ") " that")))
>
> and the result of evaluating the expression
>
> (libxml-parse-xml-region (mark) (point) nil t)
>
> is
>
> (p nil "This " (comment nil " and ") " that")
>
> In both cases, I would expect that passing t as the fourth argument
> would cause the comments to be dropped, but they are not.

It doesn't quite ignore that argument, but the argument only applies to
top-level comments. I think it's the implementation leaking through.
See in xml.c:

    static Lisp_Object
    parse_region (Lisp_Object start, Lisp_Object end, Lisp_Object base_url,
                  Lisp_Object discard_comments, bool htmlp)
    {
      ...
          /* The document doesn't have toplevel comments or we discarded
             them.  Get the tree the proper way. */
          xmlNode *node = xmlDocGetRootElement (doc);

Apparently the "proper" way already discards top-level comments, so the
DISCARD-COMMENTS parameter was added to be able to control this. Maybe
we should just update the docs to match the code though, not sure.
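
Until the behavior or the docs change, nested comments can be dropped
after parsing instead. The following is only a minimal sketch, not
anything that exists in Emacs: my-dom-strip-comments is a hypothetical
helper name, and it assumes the tree has the (TAG ATTRIBUTES
CHILDREN...) shape shown in the results above, with text nodes as
plain strings.

```elisp
;; Hypothetical helper, not part of Emacs: walk a tree returned by
;; libxml-parse-html-region or libxml-parse-xml-region and drop every
;; (comment ...) node, at any depth.
(defun my-dom-strip-comments (node)
  "Return a copy of NODE with all nested comment nodes removed."
  (if (not (consp node))
      node                              ; text nodes are plain strings
    (let (children)
      (dolist (child (cddr node))
        (unless (eq (car-safe child) 'comment)
          (push (my-dom-strip-comments child) children)))
      ;; Rebuild the node as (TAG ATTRIBUTES CHILDREN...).
      (cons (car node)
            (cons (cadr node) (nreverse children))))))

(my-dom-strip-comments
 '(p nil "This " (comment nil " and ") " that"))
;; => (p nil "This " " that")
```

Note that the sketch leaves the two text nodes around the removed
comment as separate strings rather than merging them into one.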
> Incidentally, I notice that the documentation for
> libxml-parse-xml-region includes the following sentence:
>
> If DISCARD-COMMENTS is non-nil, all HTML comments are discarded.
>
> I imagine this ought to refer to "XML comments" rather than "HTML
> comments."
Yeah, looks like copy-pasta from libxml-parse-html-region.
This bug report was last modified 7 years and 118 days ago.