GNU bug report logs - #46764
Extra ">" sails right past XML validator


Package: emacs;

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Thu, 25 Feb 2021 06:42:01 UTC

Severity: minor

Tags: notabug

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 46764 in the body.
You can then email your comments to 46764 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Thu, 25 Feb 2021 06:42:02 GMT)

Acknowledgement sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 25 Feb 2021 06:42:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: bug-gnu-emacs <at> gnu.org
Subject: Extra ">" sails right past XML validator
Date: Thu, 25 Feb 2021 07:43:52 +0800
$ cat e.xml
<?xml version="1.0" encoding="utf-8" ?>
<M>></M>
$ emacs e.xml
says at the bottom: (nXML Valid)
$ xmllint e.xml
<?xml version="1.0" encoding="utf-8"?>
<M>&gt;</M>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Thu, 25 Feb 2021 15:50:02 GMT)

Message #8 received at 46764 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: 46764 <at> debbugs.gnu.org
Subject: Re: bug#46764: Extra ">" sails right past XML validator
Date: Thu, 25 Feb 2021 16:48:59 +0100
積丹尼 Dan Jacobson <jidanni <at> jidanni.org> writes:

> $ cat e.xml
> <?xml version="1.0" encoding="utf-8" ?>
> <M>></M>
> $ emacs e.xml
> says at the bottom: (nXML Valid)

I can confirm that this problem still exists in Emacs 28.

It seems to stem from this bit of code:

(defun xmltok-forward ()
  (setq xmltok-start (point))
  (let* ((case-fold-search nil)
	 (space-count (skip-chars-forward " \t\r\n"))
	 (ch (char-after)))
    (cond ((eq ch ?\<)
	   (cond ((> space-count 0)
		  (setq xmltok-type 'space))
		 (t
		  (forward-char 1)
		  (xmltok-scan-after-lt))))
	  ((eq ch ?\&)
	   (cond ((> space-count 0)
		  (setq xmltok-type 'space))
		 (t
		  (forward-char 1)
		  (xmltok-scan-after-amp 'xmltok-handle-entity))))
	  ((re-search-forward "[<&]\\|\\(]]>\\)" nil t)
	   (cond ((not (match-beginning 1))

So (xmltok-forward) on the ">" will just return `data'.  Is it checking
just < and & for validity on purpose?  Anybody remember what the thought
process might have been here?
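
For readers who do not read Emacs regexps, the search in the excerpt can be mimicked in Python. This is only a rough analogue, not nxml code: `STOP` and `next_stop` are hypothetical names, and the sketch covers nothing beyond the one `re-search-forward` call quoted above.

```python
import re

# Python spelling of the Emacs regexp "[<&]\\|\\(]]>\\)" from the excerpt:
# stop at "<", at "&", or at the sequence "]]>" (captured as group 1).
STOP = re.compile(r"[<&]|(\]\]>)")


def next_stop(text, pos):
    """Return (index, is_cdata_end) for the first stop at or after pos, or None."""
    match = STOP.search(text, pos)
    if match is None:
        return None
    return match.start(), match.group(1) is not None


doc = "<M>></M>"
# Start scanning at the extra ">" (index 3). The search skips over it and
# stops at the "<" that opens "</M>", so the ">" is swallowed as plain text.
print(next_stop(doc, 3))        # (4, False)
# For contrast, the sequence "]]>" does stop the search, with group 1 set.
print(next_stop("a]]>b", 0))    # (1, True)
```

A lone ">" is never something the search stops at, which is consistent with the token coming back as `data`.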

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Fri, 26 Feb 2021 09:22:02 GMT)

Message #11 received at 46764 <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 46764 <at> debbugs.gnu.org
Subject: Re: bug#46764: Extra ">" sails right past XML validator
Date: Fri, 26 Feb 2021 10:21:44 +0100
">" is not a special character at top level in XML; <M>></M> is well-formed.

I agree that it is an easy mistake to make and overlook. Perhaps an optional warning would be helpful?
Note that nxml-mode is carefully written for correctness and performance, which matter because XML is a lot more complex than people think and files can be large. Any tinkering with it has to be done with prudence.
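
The well-formedness claim can be checked outside Emacs with any conforming parser. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<M>></M>"))    # True: a bare ">" is legal character data
print(is_well_formed("<M><</M>"))    # False: a bare "<" is not
print(is_well_formed("<M>&</M>"))    # False: a bare "&" is not
print(is_well_formed("<M>]]></M>"))  # False: "]]>" may not appear in content

# Serializing the parsed tree escapes the ">" on output.
root = ET.fromstring("<M>></M>")
print(ET.tostring(root, encoding="unicode"))  # <M>&gt;</M>
```

The last line shows the same normalization that xmllint printed in the original report: the serializer writes `&gt;` on output, but the input was accepted as is.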





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Fri, 26 Feb 2021 09:31:01 GMT)

Message #14 received at 46764 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 46764 <at> debbugs.gnu.org,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Subject: Re: bug#46764: Extra ">" sails right past XML validator
Date: Fri, 26 Feb 2021 10:30:00 +0100
Mattias Engdegård <mattiase <at> acm.org> writes:

> ">" is not a special character at top level in XML; <M>></M> is well-formed.
>
> I agree that it is an easy mistake to make and overlook. Perhaps an
> optional warning would be helpful?

Well, if it is valid (and it is), then I don't really see how adding an
optional warning here would be all that helpful, either -- it seems
kinda beyond the remit of the validator here to teach XML syntax?

So I'm closing this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) notabug. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 26 Feb 2021 09:31:02 GMT)

bug closed, send any further explanations to 46764 <at> debbugs.gnu.org and 積丹尼 Dan Jacobson <jidanni <at> jidanni.org> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 26 Feb 2021 09:31:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Fri, 26 Feb 2021 10:29:02 GMT)

Message #21 received at 46764 <at> debbugs.gnu.org:

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 46764 <at> debbugs.gnu.org,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Subject: Re: bug#46764: Extra ">" sails right past XML validator
Date: Fri, 26 Feb 2021 11:28:23 +0100
On 26 Feb 2021, at 10:30, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

>> ">" is not a special character at top level in XML; <M>></M> is well-formed.
>> 
>> I agree that it is an easy mistake to make and overlook. Perhaps an
>> optional warning would be helpful?
> 
> Well, if it is valid (and it is), then I don't really see how adding an
> optional warning here would be all that helpful, either -- it seems
> kinda beyond the remit of the validator here to teach XML syntax?

It's useful to prevent mistakes, not just to follow the standard to the letter. Given that the XML file is in an Emacs buffer, there is a fair chance that it was hand-written, in which case the extra ">" is likely to be unintended, especially since it can be hard for a human to spot.

Many other modes do similar things. For example, emacs-lisp-mode warns about useless backslashes in strings even though there is no actual syntax error. Think of it as a useful compiler warning.
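
To make the idea concrete, here is a rough sketch of such a warning as a standalone Python check rather than an nxml-mode patch. Everything in it is an assumption for illustration: the name `literal_gt_warnings`, the regex-based skipping of markup, and the (line, column) report format. It does not handle a DOCTYPE with an internal subset, so treat it as a toy, not a tokenizer.

```python
import re

# Markup that may legitimately contain ">"; everything between matches is
# treated as character data. Alternatives are tried in the order listed.
_MARKUP = re.compile(
    r"""<!--.*?-->                          # comment
      | <!\[CDATA\[.*?\]\]>                 # CDATA section
      | <\?.*?\?>                           # processing instruction or XML declaration
      | <(?:[^<>"']|"[^"]*"|'[^']*')*>      # tag, allowing quoted attribute values
    """,
    re.DOTALL | re.VERBOSE,
)


def _report_gt(source, start, end, out):
    """Append a 1-based (line, column) pair for each ">" in source[start:end]."""
    offset = source.find(">", start, end)
    while offset != -1:
        line = source.count("\n", 0, offset) + 1
        column = offset - source.rfind("\n", 0, offset)
        out.append((line, column))
        offset = source.find(">", offset + 1, end)


def literal_gt_warnings(source):
    """Return positions of unescaped ">" found in character data."""
    warnings = []
    pos = 0
    for match in _MARKUP.finditer(source):
        _report_gt(source, pos, match.start(), warnings)
        pos = match.end()
    _report_gt(source, pos, len(source), warnings)
    return warnings


sample = '<?xml version="1.0" encoding="utf-8" ?>\n<M>></M>\n'
print(literal_gt_warnings(sample))              # [(2, 4)]
print(literal_gt_warnings("<M>&gt;</M>"))       # []
print(literal_gt_warnings('<a b=">">x</a>'))    # []
```

The check only reports positions and never rejects the document, in keeping with a warning for input that is still well-formed.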

> So I'm closing this bug report.

That's fine. Dan can reopen if he thinks nxml-mode really needs improvement.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Fri, 26 Feb 2021 16:01:02 GMT)

Message #24 received at 46764 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Mattias Engdegård <mattiase <at> acm.org>,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>,
 "46764 <at> debbugs.gnu.org" <46764 <at> debbugs.gnu.org>
Subject: RE: [External] : bug#46764: Extra ">" sails right past XML validator
Date: Fri, 26 Feb 2021 16:00:03 +0000
> Note that nxml-mode is carefully written for correctness and
> performance, which matter because XML is a lot more complex than people
> think and files can be large. Any tinkering with it has to be done with
> prudence.

+1

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#46764; Package emacs. (Sat, 27 Feb 2021 10:45:02 GMT)

Message #27 received at 46764 <at> debbugs.gnu.org:

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Mattias Engdegård <mattiase <at> acm.org>,
 46764 <at> debbugs.gnu.org
Subject: Re: bug#46764: Extra ">" sails right past XML validator
Date: Sat, 27 Feb 2021 18:44:49 +0800
Fine. Better to let the Space Shuttle engines warn about it than catch it
earlier in Emacs, valid or not. P.S.,
$ echo '<M>></M>'|xmllint -
<?xml version="1.0"?>
<M>&gt;</M>
So please file a bug against xmllint.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 27 Mar 2021 11:24:07 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.