GNU bug report logs - #30789
26.0.91; xml-parse-region works but libxml-parse-html-region doesn't

Package: emacs;

Reported by: Katsumi Yamaoka <yamaoka <at> jpl.org>

Date: Mon, 12 Mar 2018 23:40:02 UTC

Severity: wishlist

Tags: wontfix

Found in version 26.0.91

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Katsumi Yamaoka <yamaoka <at> jpl.org>
Cc: 30789 <at> debbugs.gnu.org, 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Subject: bug#30789: 26.0.91; xml-parse-region works but libxml-parse-html-region doesn't
Date: Tue, 13 Mar 2018 01:44:22 +0100

Katsumi Yamaoka <yamaoka <at> jpl.org> writes:

> When I read the mail using Gnus + shr, the text after the broken
> point is all cut off.  That is what libxml-parse-html-region does,
> whereas xml-parse-region doesn't cut it.  Moreover a web browser,
> to which I send the html data using the `K H' command, shows all
> the text (the broken character is shown as is, though).
>
> This is not necessarily a libxml bug anyway, but I hope it works
> like xml-parse.

libxml is more strict about correctness of the input than most other
HTML parsers.  I don't think there's anything we can do about this
problematic input other than ponder whether Emacs should use a different
HTML parser, which I think sounds unlikely.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 7 years and 39 days ago.
