GNU bug report logs - #401
bug in HTML or XML syntax highlighting code

Package: emacs;

Reported by: Paul Pogonyshev <pogonyshev <at> gmx.net>

Date: Thu, 12 Jun 2008 20:20:03 UTC

Severity: minor

Tags: confirmed

Found in versions 24.5, 25.0.94

Done: Tom Tromey <tom <at> tromey.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 401 in the body.
You can then email your comments to 401 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#401; Package emacs.

Acknowledgement sent to Paul Pogonyshev <pogonyshev <at> gmx.net>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.

Message #5 received at submit <at> emacsbugs.donarmstrong.com:

From: Paul Pogonyshev <pogonyshev <at> gmx.net>
To: bug-gnu-emacs <at> gnu.org
Subject: bug in HTML or XML syntax highlighting code
Date: Thu, 12 Jun 2008 23:20:04 +0300
Hi,

With a fairly recent SVN build of Emacs, the HTML code pasted below is
highlighted wrongly.  Namely, "foo" is not highlighted as an attribute
value, apparently because there are unmatched (from Emacs's point of
view) parentheses in <script>.  With a build almost a year old at home,
I don't see this bug, so it is a regression in Emacs.

It seems that HTML/XML mode uses two different ways to find syntactic
context for indenting code and for syntax-highlighting it, which I
find bad.  In some larger files I managed to get half a file highlighted
(wrongly!) with `font-lock-string-face', yet code indenting worked
just fine.  So the same piece of code is treated as an attribute value
by the highlighting code, but as part of the normal tag tree by the
indentation code.

Also, the bug seems to depend heavily on JIT highlighting.  E.g.
if you remove and then reinsert some of the characters which Emacs
considers parens, the code is then rehighlighted correctly.

[originally posted to emacs-devel <at> gnu.org, with attachment instead of
 inlined HTML]

Bug in:

<html>
<head>
  <script>
    function x () { return 1 > 0; }
  </script>
</head>
<body class="foo">
</body>
</html>

Paul





Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#401; Package emacs.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.

Message #10 received at submit <at> emacsbugs.donarmstrong.com:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Pogonyshev <pogonyshev <at> gmx.net>
Cc: 401 <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
Subject: Re: bug#401: bug in HTML or XML syntax highlighting code
Date: Sat, 14 Jun 2008 15:46:03 -0400
> With a fairly recent SVN build of Emacs, the HTML code pasted below is
> highlighted wrongly.  Namely, "foo" is not highlighted as an attribute
> value, apparently because there are unmatched (from Emacs's point of
> view) parentheses in <script>.

Indeed, part of the problem is that we use sgml-mode for this, even
though your file doesn't seem like a properly formed SGML file.  We need
to add special support for <script>.  Note that we do not properly
support SGML either, e.g. if you use a <![CDATA[...]]> construct you'll
bump into the same kinds of problems.

> It seems that HTML/XML mode uses two different ways to find syntactic
> context for indenting code and for syntax-highlighting it, which I

Most/all major modes do.  The syntax-highlighting is done "globally"
(especially the comment-vs-string-vs-code distinction), so it can get
seriously messed up over the whole buffer in case the buffer's syntax is
incorrect or is using constructs which the major mode doesn't
understand.  The indentation code usually can work much more locally, so
it tends to be more resilient.


        Stefan
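
To illustrate the failure mode discussed in this message, here is a
minimal Python sketch of a scanner that treats `<' and `>' as a matched
pair, the way paired delimiters are counted.  It is a toy model under
that assumption, not the sgml-mode implementation; the names SAMPLE and
attribute_values are hypothetical.

```python
# Toy model only: this is NOT how Emacs' sgml-mode is implemented.
# It assumes '<' and '>' are counted like matched parentheses.

SAMPLE = """<html>
<head>
  <script>
    function x () { return 1 > 0; }
  </script>
</head>
<body class="foo">
</body>
</html>
"""


def attribute_values(text):
    """Return quoted strings seen while the scanner believes it is in a tag.

    '<' raises the nesting depth and '>' lowers it, with no clamping at
    zero, so a stray '>' leaves the depth off by one for the rest of the
    buffer.
    """
    depth = 0
    values = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == '"' and depth > 0:
            # Only a quote inside a tag starts an attribute value.
            end = text.find('"', i + 1)
            if end == -1:
                break
            values.append(text[i + 1:end])
            i = end  # jump to the closing quote
        i += 1
    return values


print(attribute_values(SAMPLE))
print(attribute_values('<body class="foo">'))
```

On SAMPLE, the `>' in `1 > 0' drives the depth negative, so the scanner
never believes it is inside the <body> tag and reports no attribute
values.  On `<body class="foo">' alone, it reports foo.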





Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#401; Package emacs.

Acknowledgement sent to "Lennart Borgman (gmail)" <lennart.borgman <at> gmail.com>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.

Message #20 received at submit <at> emacsbugs.donarmstrong.com:

From: "Lennart Borgman (gmail)" <lennart.borgman <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>, 401 <at> debbugs.gnu.org
Cc: Paul Pogonyshev <pogonyshev <at> gmx.net>, bug-gnu-emacs <at> gnu.org
Subject: Re: bug#401: bug in HTML or XML syntax highlighting code
Date: Sat, 14 Jun 2008 22:19:15 +0200
Stefan Monnier wrote:
>> With a fairly recent SVN build of Emacs, the HTML code pasted below is
>> highlighted wrongly.  Namely, "foo" is not highlighted as an attribute
>> value, apparently because there are unmatched (from Emacs's point of
>> view) parentheses in <script>.
> 
> Indeed, part of the problem is that we use sgml-mode for this, even
> though your file doesn't seem like a properly formed SGML file.  We need
> to add special support for <script>.  Note that we do not properly
> support SGML either, e.g. if you use a <![CDATA[...]]> construct you'll
> bump into the same kinds of problems.

Does not nxml-mode handle this better?

>> It seems that HTML/XML mode uses two different ways to find syntactic
>> context for indenting code and for syntax-highlighting it, which I
> 
> Most/all major modes do.  The syntax-highlighting is done "globally"
> (especially the comment-vs-string-vs-code distinction), so it can get
> seriously messed up over the whole buffer in case the buffer's syntax is
> incorrect or is using constructs which the major mode doesn't
> understand.

I believe the cure for this is some kind of multi-major-mode handling.





Added tag(s) confirmed. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Wed, 08 Jun 2016 22:29:01 GMT)

bug Marked as found in versions 24.5. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Wed, 08 Jun 2016 22:29:01 GMT)

bug Marked as found in versions 25.0.94. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Wed, 08 Jun 2016 22:29:01 GMT)

Reply sent to Tom Tromey <tom <at> tromey.com>:
You have taken responsibility. (Sun, 21 May 2017 16:15:01 GMT)

Notification sent to Paul Pogonyshev <pogonyshev <at> gmx.net>:
bug acknowledged by developer. (Sun, 21 May 2017 16:15:01 GMT)

Message #36 received at 401-done <at> debbugs.gnu.org:

From: Tom Tromey <tom <at> tromey.com>
To: 401-done <at> debbugs.gnu.org
Subject: fixed by mhtml-mode
Date: Sun, 21 May 2017 10:14:04 -0600
I tried this test case using mhtml-mode, and it works fine there.
Because mhtml is the default now for HTML files, I think this bug has
been fixed.

thanks,
Tom
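
The idea of handling <script> specially, raised earlier in the thread,
can be sketched in a toy model: blank out the body of each <script>
element before scanning, so that its `>' never reaches the delimiter
counter.  This is only an illustration under that assumption and does
not describe how mhtml-mode is implemented; blank_script_bodies and
attribute_values are hypothetical names.

```python
import re

# Toy model only: this does NOT describe mhtml-mode's implementation.

SAMPLE = """<html>
<head>
  <script>
    function x () { return 1 > 0; }
  </script>
</head>
<body class="foo">
</body>
</html>
"""

# Opening <script ...> tag, its body (non-greedy), and the closing tag.
SCRIPT_ELEMENT = re.compile(
    r"(<script\b[^>]*>)(.*?)(</script\s*>)",
    re.IGNORECASE | re.DOTALL,
)


def blank_script_bodies(text):
    """Replace each <script> body with spaces, keeping all offsets intact."""
    def blank(match):
        return match.group(1) + " " * len(match.group(2)) + match.group(3)
    return SCRIPT_ELEMENT.sub(blank, text)


def attribute_values(text):
    """Collect quoted strings found while '<'/'>' nesting depth is positive."""
    depth = 0
    values = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == '"' and depth > 0:
            end = text.find('"', i + 1)
            if end == -1:
                break
            values.append(text[i + 1:end])
            i = end  # jump to the closing quote
        i += 1
    return values


print(attribute_values(SAMPLE))
print(attribute_values(blank_script_bodies(SAMPLE)))
```

With the script body blanked, the depth is back at zero when the scanner
reaches <body>, so "foo" is recognised as an attribute value again.
Replacing the body with spaces rather than deleting it keeps every
buffer position unchanged.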




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 19 Jun 2017 11:24:03 GMT)

This bug report was last modified 7 years and 361 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.