GNU bug report logs - #44348
28.0.50; eww renders xml processing element as is


Package: emacs;

Reported by: Pankaj Jangid <pankaj <at> codeisgreat.org>

Date: Sat, 31 Oct 2020 15:48:01 UTC

Severity: normal

Found in version 28.0.50

Done: Stephen Berman <stephen.berman <at> gmx.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 44348 in the body.
You can then email your comments to 44348 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#44348; Package emacs. (Sat, 31 Oct 2020 15:48:02 GMT)

Acknowledgement sent to Pankaj Jangid <pankaj <at> codeisgreat.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 31 Oct 2020 15:48:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Pankaj Jangid <pankaj <at> codeisgreat.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 28.0.50; eww renders xml processing element as is
Date: Sat, 31 Oct 2020 21:17:35 +0530

I published a webpage using org. The output has this xml element at the
top:

<?xml version="1.0" encoding="utf-8"?>

But this is rendered as it is in eww when I fetch it from the hosted
website.

When I view-source the element there is:

&lt;?xml version="1.0" encoding="utf-8"?>

Note that the opening angle bracket is converted to HTML entity type.

The hosted page is https://codeisgreat.org/ and source is at
https://github.com/jangid/codeisgreat/blob/master/docs/index.html. If
that helps.


In GNU Emacs 28.0.50 (build 1, x86_64-apple-darwin19.6.0, NS
appkit-1894.60 Version 10.15.7 (Build 19H2))
 of 2020-10-31 built on mb2.local
Repository revision: 74c45a62e1e48d7c52dc513b6911e65dcc38aa23
Repository branch: master
Windowing system distributor 'Apple', version 10.3.1894
System Description: Mac OS X 10.15.7

Configured using:
 'configure LDFLAGS=-L/usr/local/opt/ruby/lib
 CPPFLAGS=-I/usr/local/opt/ruby/include
 PKG_CONFIG_PATH=:/usr/local/opt/sqlite/lib/pkgconfig:/usr/local/opt/libxml2/lib/pkgconfig:/usr/local/opt/openssl/lib/pkgconfig:/usr/local/opt/libffi/lib/pkgconfig:/usr/local/opt/ruby/lib/pkgconfig'

Configured features: JPEG TIFF GIF PNG RSVG DBUS GLIB NOTIFY KQUEUE ACL
GNUTLS LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES THREADS JSON PDUMPER
LCMS2

Important settings:
  value of $LC_CTYPE: UTF-8
  value of $LANG: en_IN.UTF-8
  locale-coding-system: utf-8-unix

Major mode: eww

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows: None found.

Features: (shadow sort mail-extr emacsbug message dired dired-loaddefs
rfc822 mml mml-sec epa derived epg epg-config mm-decode mm-bodies
mm-encode mailabbrev gmm-utils mailheader sendmail view mhtml-mode
css-mode smie color js imenu cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs sgml-mode cl-extra
help-mode gnutls network-stream url-http mail-parse rfc2231 url-gw nsm
rmc url-cache url-auth format-spec eww easymenu xdg url-queue thingatpt
shr kinsoku svg xml dom browse-url url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf url-util url-parse
url-vars mailcap puny mm-url gnus nnheader gnus-util rmail
rmail-loaddefs auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map rfc2047 rfc2045 ietf-drums
text-property-search time-date subr-x seq byte-opt gv bytecomp
byte-compile cconv mail-utils wid-edit mm-util mail-prsvr cl-loaddefs
cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/ns-win ns-win ucs-normalize mule-util
term/common-win tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite charscript charprop case-table epa-hook
jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button
loaddefs faces cus-face macroexp files window text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads dbusbind kqueue cocoa ns
lcms2 multi-tty make-network-process emacs)

Memory information: ((conses 16 116182 8858)
 (symbols 48 13342 1) (strings 32 39602 2558) (string-bytes 1 1369339)
 (vectors 16 20825) (vector-slots 8 262285 11163) (floats 8 141 51)
 (intervals 56 695 0) (buffers 992 14))

-- 
Pankaj




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44348; Package emacs. (Sat, 31 Oct 2020 19:48:02 GMT)

Message #8 received at 44348 <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Pankaj Jangid <pankaj <at> codeisgreat.org>
Cc: 44348 <at> debbugs.gnu.org
Subject: Re: bug#44348: 28.0.50; eww renders xml processing element as is
Date: Sat, 31 Oct 2020 20:47:09 +0100

On Sat, 31 Oct 2020 21:17:35 +0530 Pankaj Jangid <pankaj <at> codeisgreat.org> wrote:

> I published a webpage using org. The output has this xml element at the
> top:
>
> <?xml version="1.0" encoding="utf-8"?>
>
> But this is rendered as it is in eww when I fetch it from the hosted
> website.
>
> When I view-source the element there is:
>
> &lt;?xml version="1.0" encoding="utf-8"?>
>
> Note that the opening angle bracket is converted to HTML entity type.

The simplest fix would seem to be this:

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fd9fe98439..051698d6d6 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -420,7 +420,7 @@ eww--preprocess-html
       (narrow-to-region start end)
       (goto-char start)
       (let ((case-fold-search t))
-        (while (re-search-forward "<[^0-9a-z!/]" nil t)
+        (while (re-search-forward "<[^0-9a-z!?/]" nil t)
           (goto-char (match-beginning 0))
           (delete-region (point) (1+ (point)))
           (insert "&lt;"))))))

But if that's too permissive, then a more specific fix is this:

diff --git a/lisp/net/eww.el b/lisp/net/eww.el
index fd9fe98439..bc795df256 100644
--- a/lisp/net/eww.el
+++ b/lisp/net/eww.el
@@ -421,9 +421,11 @@ eww--preprocess-html
       (goto-char start)
       (let ((case-fold-search t))
         (while (re-search-forward "<[^0-9a-z!/]" nil t)
-          (goto-char (match-beginning 0))
-          (delete-region (point) (1+ (point)))
-          (insert "&lt;"))))))
+          (unless (and (looking-back "\\?" (line-beginning-position))
+                       (looking-at "xml"))
+            (goto-char (match-beginning 0))
+            (delete-region (point) (1+ (point)))
+            (insert "&lt;")))))))

 ;;;###autoload
 (defalias 'browse-web 'eww)
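
To compare the two patches above, here is a minimal sketch that runs the same escaping loop on a string instead of the eww buffer. The function `my-escape-stray-lt`, its KEEP-XML-PI argument and the sample text `my-sample` are hypothetical names introduced only for this illustration; they are not part of eww.el. The loop body and the `looking-back`/`looking-at` test are copied from the diffs. The `<?php` line is an assumed example of another processing instruction, added to show where the two patches differ.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper (not in eww.el): apply the escaping loop from
;; `eww--preprocess-html' to the string HTML.  REGEXP finds a "<" that
;; does not look like the start of a tag.  When KEEP-XML-PI is non-nil,
;; a match that begins "<?xml" is left alone, as in the second patch.
(defun my-escape-stray-lt (html regexp &optional keep-xml-pi)
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (let ((case-fold-search t))
      (while (re-search-forward regexp nil t)
        ;; Point is now just after the two matched characters.
        (unless (and keep-xml-pi
                     (looking-back "\\?" (line-beginning-position))
                     (looking-at "xml"))
          (goto-char (match-beginning 0))
          (delete-region (point) (1+ (point)))
          (insert "&lt;"))))
    (buffer-string)))

;; Assumed sample input: an XML declaration, another processing
;; instruction, and a genuinely stray "<" inside a paragraph.
(defconst my-sample
  "<?xml version=\"1.0\"?>\n<?php x ?>\n<p>1 < 2</p>")

;; Unpatched regexp: both "<?" sequences and the stray "<" are escaped.
(my-escape-stray-lt my-sample "<[^0-9a-z!/]")

;; First patch ("?" added to the character class): only the stray "<"
;; is escaped; every "<?" is left alone.
(my-escape-stray-lt my-sample "<[^0-9a-z!?/]")

;; Second patch (old regexp plus the "<?xml" test): "<?xml" is kept,
;; but "<?php" is still escaped.
(my-escape-stray-lt my-sample "<[^0-9a-z!/]" t)
```

On this sample the first patch passes every `<?` sequence through unchanged, while the second only exempts `<?xml`, which is the sense in which the first is the more permissive of the two.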

Steve Berman




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44348; Package emacs. (Sun, 01 Nov 2020 13:29:01 GMT)

Message #11 received at 44348 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stephen Berman <stephen.berman <at> gmx.net>
Cc: 44348 <at> debbugs.gnu.org, Pankaj Jangid <pankaj <at> codeisgreat.org>
Subject: Re: bug#44348: 28.0.50; eww renders xml processing element as is
Date: Sun, 01 Nov 2020 14:28:46 +0100

Stephen Berman <stephen.berman <at> gmx.net> writes:

> The simplest fix would seem to be this:

[...]

> @@ -420,7 +420,7 @@ eww--preprocess-html
>        (narrow-to-region start end)
>        (goto-char start)
>        (let ((case-fold-search t))
> -        (while (re-search-forward "<[^0-9a-z!/]" nil t)
> +        (while (re-search-forward "<[^0-9a-z!?/]" nil t)

Looks good to me; go ahead and push.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44348; Package emacs. (Sun, 01 Nov 2020 23:09:01 GMT)

Message #14 received at 44348 <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 44348 <at> debbugs.gnu.org, Pankaj Jangid <pankaj <at> codeisgreat.org>
Subject: Re: bug#44348: 28.0.50; eww renders xml processing element as is
Date: Mon, 02 Nov 2020 00:08:43 +0100

On Sun, 01 Nov 2020 14:28:46 +0100 Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

> Stephen Berman <stephen.berman <at> gmx.net> writes:
>
>> The simplest fix would seem to be this:
>
> [...]
>
>> @@ -420,7 +420,7 @@ eww--preprocess-html
>>        (narrow-to-region start end)
>>        (goto-char start)
>>        (let ((case-fold-search t))
>> -        (while (re-search-forward "<[^0-9a-z!/]" nil t)
>> +        (while (re-search-forward "<[^0-9a-z!?/]" nil t)
>
> Looks good to me; go ahead and push.

Thanks.  I checked and saw that eww--preprocess-html is a new function
in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
so should the fix go into the emacs-27 branch?

Steve Berman




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44348; Package emacs. (Mon, 02 Nov 2020 15:17:01 GMT)

Message #17 received at 44348 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stephen Berman <stephen.berman <at> gmx.net>
Cc: 44348 <at> debbugs.gnu.org, Pankaj Jangid <pankaj <at> codeisgreat.org>
Subject: Re: bug#44348: 28.0.50; eww renders xml processing element as is
Date: Mon, 02 Nov 2020 16:16:40 +0100

Stephen Berman <stephen.berman <at> gmx.net> writes:

> Thanks.  I checked and saw that eww--preprocess-html is a new function
> in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
> so should the fix go into the emacs-27 branch?

Yes, this should be a safe enough fix to go to emacs-27.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Reply sent to Stephen Berman <stephen.berman <at> gmx.net>:
You have taken responsibility. (Mon, 02 Nov 2020 22:29:02 GMT)

Notification sent to Pankaj Jangid <pankaj <at> codeisgreat.org>:
bug acknowledged by developer. (Mon, 02 Nov 2020 22:29:02 GMT)

Message #22 received at 44348-done <at> debbugs.gnu.org:

From: Stephen Berman <stephen.berman <at> gmx.net>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 44348-done <at> debbugs.gnu.org, Pankaj Jangid <pankaj <at> codeisgreat.org>
Subject: Re: bug#44348: 28.0.50; eww renders xml processing element as is
Date: Mon, 02 Nov 2020 23:28:09 +0100

On Mon, 02 Nov 2020 16:16:40 +0100 Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

> Stephen Berman <stephen.berman <at> gmx.net> writes:
>
>> Thanks.  I checked and saw that eww--preprocess-html is a new function
>> in emacs-27 (commit 568f1488), and the bug does not happen in emacs-26,
>> so should the fix go into the emacs-27 branch?
>
> Yes, this should be a safe enough fix to go to emacs-27.

Done in commit 1b7ab9d0ac and closing the bug.

Steve Berman




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 01 Dec 2020 12:24:08 GMT)

This bug report was last modified 4 years and 256 days ago.
