GNU bug report logs - #61043
30.0.50; `json-ts-mode': invalid font lock rule


Package: emacs;

Reported by: Mickey Petersen <mickey <at> masteringemacs.org>

Date: Tue, 24 Jan 2023 20:10:02 UTC

Severity: normal

Found in version 30.0.50

To reply to this bug, email your comments to 61043 AT debbugs.gnu.org.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Tue, 24 Jan 2023 20:10:02 GMT)

Acknowledgement sent to Mickey Petersen <mickey <at> masteringemacs.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 24 Jan 2023 20:10:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Tue, 24 Jan 2023 20:09:40 +0000
There's a comment font lock rule in `json-ts-mode'. However, that is
illegal and against the JSON spec, and indeed the search query fails
because `comment' is not a valid node type.


In GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
 3.24.20, cairo version 1.16.0) of 2023-01-17 built on mickey-work
Repository revision: bb383a54910c3094e5d228e0af62bf70e36203ca
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: Ubuntu 20.04.3 LTS

Configured using:
 'configure --with-native-compilation --with-json --with-mailutils
 --without-compress-install --with-imagemagick CC=gcc-10'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ
IMAGEMAGICK JPEG JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2
M17N_FLT MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP
SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER X11 XDBE
XIM XINPUT2 XPM GTK3 ZLIB





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Tue, 24 Jan 2023 21:09:02 GMT)

Message #8 received at 61043 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>, "61043 <at> debbugs.gnu.org"
 <61043 <at> debbugs.gnu.org>
Subject: RE: [External] : bug#61043: 30.0.50; `json-ts-mode': invalid font
 lock rule
Date: Tue, 24 Jan 2023 21:08:09 +0000
> There's a comment font lock rule in `json-ts-mode'. However, that is
> illegal and against the JSON spec, and indeed the search query fails
> because `comment' is not a valid node type.

Caveat: Not following this thread, ignorant of
tree sitter, and probably ignorant of the use-
case context.

JSON syntax per its spec(s) is one thing.
JSON out there in the wild is something else.

There are zillions of JSON documents that
aren't well-formed per the specs.  And lots of
apps that create and use such data.

As a result, in the real world, tools that we
expect to be useful for working with real data
need to be _able_ (optionally) to handle at
least the more common such deviations from
what the specs prescribe.

One way to do that is to have a variable/mode
that controls the kind(s) of well-formedness
you want to enforce.  E.g., have two modes:
lax and strict.  Or let functions dealing with
data have an optional arg that specifies the
syntax (lax or strict) to enforce.

And of course, we'd want to document just what
"lax" mode means: what syntax departures from
the specs our lax syntax tolerates.

A lax syntax, for example, often reflects the
JavaScript syntax for object fields; boolean
and null values aren't case-sensitive; and
it's more permissive with respect to numerals,
whitespace, and escaping of Unicode characters
than what the JSON specs require.

E.g., in JavaScript notation, a field name in
an object literal can be, but need not be, in
double quotation marks.  And alternatively it
can be in single quotation marks.

Other things often allowed:

* An extra comma (,) after the last element of
  an array or the last member of an object
  (e.g., [a, b, c,], {a:b, c:d,}).
* Numerals with leading zeros (e.g., 0042.3).
* Fractional numerals that lack 0 before the
  decimal point (e.g., .14 instead of 0.14).
* Numerals with no fractional part after the
  decimal point (e.g., 342. or 1.e27).
* A plus sign (+) preceding a numeral, meaning
  that the number is non-negative (e.g., +1.3).
* Treating all ASCII control chars, and the
  ASCII space character, as (insignificant)
  whitespace chars.

Lax-syntax JSON data is everywhere.  It's good
to have tools that enforce the strict syntax of
the standards, but it's also good to have tools
that tolerate real-world, loosey-goosey JSON.

HTH.
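
As a rough illustration of the strict side of this distinction, the sketch below feeds several of the deviations listed above to Python's standard `json` module, which rejects each of them.  Python is used here only for illustration; it is not part of this thread, and `is_strict_json` and `LAX_SAMPLES` are hypothetical names.

```python
import json


def is_strict_json(text: str) -> bool:
    """Return True if TEXT parses with Python's standard json module."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Deviations mentioned above; the standard parser rejects each one.
LAX_SAMPLES = [
    '[1, 2, 3,]',        # trailing comma in an array
    '{"a": 1,}',         # trailing comma in an object
    '{a: 1}',            # unquoted field name
    "{'a': 1}",          # single-quoted field name
    '0042.3',            # leading zeros
    '.14',               # no digit before the decimal point
    '342.',              # no digits after the decimal point
    '+1.3',              # leading plus sign
    'TRUE',              # wrong-case boolean
    '{"a": 1} // note',  # trailing comment
]

for sample in LAX_SAMPLES:
    print(f"{sample!r}: strict={is_strict_json(sample)}")
```

A "lax" mode along the lines described above would have to accept some documented subset of these inputs; which subset is a design choice.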




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Tue, 24 Jan 2023 21:11:02 GMT)

Message #11 received at 61043 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Drew Adams <drew.adams <at> oracle.com>, Mickey Petersen
 <mickey <at> masteringemacs.org>,
 "61043 <at> debbugs.gnu.org" <61043 <at> debbugs.gnu.org>
Subject: RE: [External] : bug#61043: 30.0.50; `json-ts-mode': invalid font
 lock rule
Date: Tue, 24 Jan 2023 21:10:45 +0000
Meant to mention comments. ;-)  There are some
JSON comment syntaxes out there.  Whether our
"lax" syntax, if we have one, supports any is
a choice.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 01:29:01 GMT)

Message #14 received at 61043 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Mickey Petersen <mickey <at> masteringemacs.org>, 61043 <at> debbugs.gnu.org
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 03:28:00 +0200
On 24/01/2023 22:09, Mickey Petersen wrote:
> There's a comment font lock rule in `json-ts-mode'. However, that is
> illegal and againt the JSON spec, and indeed the search query fails
> because `comment' is not a valid node type.

When you say it fails, how does that look to you?

Here's an example of a JSON file (or, more accurately, a JSON-superset 
file) with comments: 
https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json

The JSON tree-sitter grammar seems to parse them correctly as comments 
("comment" node type), and json-ts-mode highlights them as comments 
correctly as a result.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 07:30:02 GMT)

Message #17 received at 61043 <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: "61043 <at> debbugs.gnu.org" <61043 <at> debbugs.gnu.org>
Subject: Re: [External] : bug#61043: 30.0.50; `json-ts-mode': invalid font
 lock rule
Date: Wed, 25 Jan 2023 07:27:46 +0000
Drew Adams <drew.adams <at> oracle.com> writes:

>> There's a comment font lock rule in `json-ts-mode'. However, that is
>> illegal and against the JSON spec, and indeed the search query fails
>> because `comment' is not a valid node type.
>
> Caveat: Not following this thread, ignorant of
> tree sitter, and probably ignorant of the use-
> case context.
>
> JSON syntax per its spec(s) is one thing.
> JSON out there in the wild is something else.
>
> There are zillions of JSON documents that
> aren't well-formed per the specs.  And lots of
> apps that create and use such data.
>

Agreed. Sadly, the tree-sitter JSON grammar does not support comments,
and so no comment support is possible. OTOH, because it is strict, it'll
catch errors like trailing commas, etc.

For 'lax' JSON, the best option is JavaScript.

> As a result, in the real world, tools that we
> expect to be useful for working with real data
> need to be _able_ (optionally) to handle at
> least the more common such deviations from
> what the specs prescribe.
>
> One way to do that is to have a variable/mode
> that controls the kind(s) of well-formedness
> you want to enforce.  E.g., have two modes:
> lax and strict.  Or let functions dealing with
> data have an optional arg that specifies the
> syntax (lax or strict) to enforce.
>
> And of course, we'd want to document just what
> "lax" mode means: what syntax departures from
> the specs our lax syntax tolerates.
>
> A lax syntax, for example, often reflects the
> JavaScript syntax for object fields; boolean
> and null values aren't case-sensitive; and
> it's more permissive with respect to numerals,
> whitespace, and escaping of Unicode characters
> than what the JSON specs require.
>
> E.g., in JavaScript notation, a field name in
> an object literal can be, but need not be, in
> double quotation marks.  And alternatively it
> can be in single quotation marks.
>
> Other things often allowed:
>
> * An extra comma (,) after the last element of
>   an array or the last member of an object
>   (e.g., [a, b, c,], {a:b, c:d,}).
> * Numerals with leading zeros (e.g., 0042.3).
> * Fractional numerals that lack 0 before the
>   decimal point (e.g., .14 instead of 0.14).
> * Numerals with no fractional part after the
>   decimal point (e.g., 342. or 1.e27).
> * A plus sign (+) preceding a numeral, meaning
>   that the number is non-negative (e.g., +1.3).
> * Treating all ASCII control chars, and the
>   ASCII space character, as (insignificant)
>   whitespace chars.
>
> Lax-syntax JSON data is everywhere.  It's good
> to have tools that enforce the strict syntax of
> the standards, but it's also good to have tools
> that tolerate real-world, loosey-goosey JSON.
>
> HTH.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 07:31:02 GMT)

Message #20 received at 61043 <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 61043 <at> debbugs.gnu.org
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 07:29:23 +0000
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 24/01/2023 22:09, Mickey Petersen wrote:
>> There's a comment font lock rule in `json-ts-mode'. However, that is
>> illegal and against the JSON spec, and indeed the search query fails
>> because `comment' is not a valid node type.
>
> When you say it fails, how does that look to you?
>
> Here's an example of a JSON file (or, more accurately, a JSON-superset
> file) with comments:
> https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json
>
> The JSON tree-sitter grammar seems to parse them correctly as comments
> ("comment" node type), and json-ts-mode highlights them as comments
> correctly as a result.

It may well be my JSON grammar file that is different then. Which is
perhaps even worse: it is easy to find yourself with one of two
versions.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 12:09:02 GMT)

Message #23 received at 61043 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: 61043 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 13:08:31 +0100
Mickey Petersen <mickey <at> masteringemacs.org> writes:

> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>
>> On 24/01/2023 22:09, Mickey Petersen wrote:
>>> There's a comment font lock rule in `json-ts-mode'. However, that is
>>> illegal and against the JSON spec, and indeed the search query fails
>>> because `comment' is not a valid node type.
>>
>> When you say it fails, how does that look to you?
>>
>> Here's an example of a JSON file (or, more accurately, a JSON-superset
>> file) with comments:
>> https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json
>>
>> The JSON tree-sitter grammar seems to parse them correctly as comments
>> ("comment" node type), and json-ts-mode highlights them as comments
>> correctly as a result.
>
> It may well be my JSON grammar file that is different then. Which is
> perhaps even worse: it is easy to find yourself with one of two
> versions.


See [0], it seems comment is supported if I'm not mistaken.

Theo

[0]: https://github.com/tree-sitter/tree-sitter-json/blob/master/grammar.js#L6




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 12:11:02 GMT)

Message #26 received at 61043 <at> debbugs.gnu.org:

From: Mickey Petersen <mickey <at> masteringemacs.org>
To: Theodor Thornhill <theo <at> thornhill.no>
Cc: 61043 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 12:09:08 +0000
Theodor Thornhill <theo <at> thornhill.no> writes:

> Mickey Petersen <mickey <at> masteringemacs.org> writes:
>
>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>
>>> On 24/01/2023 22:09, Mickey Petersen wrote:
>>>> There's a comment font lock rule in `json-ts-mode'. However, that is
>>>> illegal and against the JSON spec, and indeed the search query fails
>>>> because `comment' is not a valid node type.
>>>
>>> When you say it fails, how does that look to you?
>>>
>>> Here's an example of a JSON file (or, more accurately, a JSON-superset
>>> file) with comments:
>>> https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json
>>>
>>> The JSON tree-sitter grammar seems to parse them correctly as comments
>>> ("comment" node type), and json-ts-mode highlights them as comments
>>> correctly as a result.
>>
>> It may well be my JSON grammar file that is different then. Which is
>> perhaps even worse: it is easy to find yourself with one of two
>> versions.
>
>
> See [0], it seems comment is supported if I'm not mistaken.
>
> Theo
>
> [0]: https://github.com/tree-sitter/tree-sitter-json/blob/master/grammar.js#L6

I understand. But nevertheless, I do get an error for that rule as it's missing (for some inexplicable reason).

It would be better if the font lock machinery disables/ignores the rule if it encounters a validation error. That way it'll gracefully degrade.
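
A minimal sketch of that graceful-degradation idea, written in Python purely as a model: rules whose queries refer to node types the installed grammar does not define are dropped instead of raising an error. This is not how Emacs's font-lock machinery is implemented; `FontLockRule`, `usable_rules`, and the node-type set for the older grammar are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontLockRule:
    feature: str           # e.g. "comment"
    node_types: frozenset  # node types the rule's query refers to


def usable_rules(rules, grammar_node_types):
    """Split RULES into (kept, dropped) given the grammar's node types."""
    kept, dropped = [], []
    for rule in rules:
        # A rule is usable only if every node type it names is defined.
        if rule.node_types <= grammar_node_types:
            kept.append(rule)
        else:
            dropped.append(rule)
    return kept, dropped


rules = [
    FontLockRule("comment", frozenset({"comment"})),
    FontLockRule("string", frozenset({"string"})),
    FontLockRule("pair", frozenset({"pair", "string"})),
]

# Hypothetical grammar build that lacks a "comment" node type.
old_grammar = frozenset(
    {"document", "object", "pair", "array", "string", "number"}
)

kept, dropped = usable_rules(rules, old_grammar)
print("kept:", [r.feature for r in kept])
print("dropped:", [r.feature for r in dropped])
```

With this approach the remaining features would still be highlighted, and the dropped ones could be reported once as a warning.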





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 13:01:02 GMT)

Message #29 received at 61043 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: 61043 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 15:00:16 +0200
> Cc: 61043 <at> debbugs.gnu.org
> From: Mickey Petersen <mickey <at> masteringemacs.org>
> Date: Wed, 25 Jan 2023 07:29:23 +0000
> 
> 
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
> 
> > On 24/01/2023 22:09, Mickey Petersen wrote:
> >> There's a comment font lock rule in `json-ts-mode'. However, that is
> >> illegal and against the JSON spec, and indeed the search query fails
> >> because `comment' is not a valid node type.
> >
> > When you say it fails, how does that look to you?
> >
> > Here's an example of a JSON file (or, more accurately, a JSON-superset
> > file) with comments:
> > https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json
> >
> > The JSON tree-sitter grammar seems to parse them correctly as comments
> > ("comment" node type), and json-ts-mode highlights them as comments
> > correctly as a result.
> 
> It may well be my JSON grammar file that is different then. Which is
> perhaps even worse: it is easy to find yourself with one of two
> versions.

For best results, always use the latest from their Git repository.
Many of the grammar libraries are updated every few days, so they are
not stable enough to rely on outdated versions.  Unfortunately,
there's no "grammar version" API in the tree-sitter-to-grammar
protocol, so we cannot even implement version checking, and refuse to
use outdated (and thus buggy) grammar libraries.  Moreover, many
grammar libraries don't even make releases and thus don't announce
their version.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 13:16:02 GMT)

Message #32 received at 61043 <at> debbugs.gnu.org:

From: Theodor Thornhill <theo <at> thornhill.no>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: 61043 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 14:14:11 +0100

On 25 January 2023 13:09:08 CET, Mickey Petersen <mickey <at> masteringemacs.org> wrote:
>
>Theodor Thornhill <theo <at> thornhill.no> writes:
>
>> Mickey Petersen <mickey <at> masteringemacs.org> writes:
>>
>>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>>
>>>> On 24/01/2023 22:09, Mickey Petersen wrote:
>>>>> There's a comment font lock rule in `json-ts-mode'. However, that is
>>>>> illegal and against the JSON spec, and indeed the search query fails
>>>>> because `comment' is not a valid node type.
>>>>
>>>> When you say it fails, how does that look to you?
>>>>
>>>> Here's an example of a JSON file (or, more accurately, a JSON-superset
>>>> file) with comments:
>>>> https://raw.githubusercontent.com/huytd/vscode-espresso-tutti/master/themes/Espresso%20Tutti-color-theme.json
>>>>
>>>> The JSON tree-sitter grammar seems to parse them correctly as comments
>>>> ("comment" node type), and json-ts-mode highlights them as comments
>>>> correctly as a result.
>>>
>>> It may well be my JSON grammar file that is different then. Which is
>>> perhaps even worse: it is easy to find yourself with one of two
>>> versions.
>>
>>
>> See [0], it seems comment is supported if I'm not mistaken.
>>
>> Theo
>>
>> [0]: https://github.com/tree-sitter/tree-sitter-json/blob/master/grammar.js#L6
>
>I understand. But nevertheless, I do get an error for that rule as it's missing (for some inexplicable reason).
>
>It would be better if the font lock machinery disables/ignores the rule if it encounters a validation error. That way it'll gracefully degrade.
>

Yeah but this touches a deeper point, imo.

There's no good way to version this. Perhaps when we stop committing treesit stuff to emacs-29 we create a list of verified git commit hashes that are supported by the mode? That way we at least have _some_ info.

What do you think?

Theo
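
One way to picture the proposed list of verified hashes, sketched in Python as a model only. It assumes the commit a grammar was built from is recorded at install time, since (as noted earlier in the thread) the grammar library itself exposes no version. `VERIFIED_COMMITS` and `grammar_status` are hypothetical names, and the hash strings are placeholders, not real commits.

```python
# Placeholder values, not real commit hashes of any grammar.
VERIFIED_COMMITS = {
    "json": {"<verified-commit-hash-1>", "<verified-commit-hash-2>"},
}


def grammar_status(language, installed_commit):
    """Classify an installed grammar against the verified-commit table.

    INSTALLED_COMMIT is the hash recorded when the grammar was built,
    or None if nothing was recorded.
    """
    verified = VERIFIED_COMMITS.get(language)
    if verified is None:
        return "unknown-language"
    if installed_commit is None:
        return "unknown-version"
    return "verified" if installed_commit in verified else "unverified"


print(grammar_status("json", "<verified-commit-hash-1>"))
print(grammar_status("json", "<some-other-commit>"))
print(grammar_status("json", None))
```

A mode could warn on "unverified" or "unknown-version" rather than refuse to start, which would give users _some_ info without blocking newer grammars.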




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61043; Package emacs. (Wed, 25 Jan 2023 13:31:01 GMT)

Message #35 received at 61043 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Mickey Petersen <mickey <at> masteringemacs.org>,
 Theodor Thornhill <theo <at> thornhill.no>
Cc: 61043 <at> debbugs.gnu.org
Subject: Re: bug#61043: 30.0.50; `json-ts-mode': invalid font lock rule
Date: Wed, 25 Jan 2023 15:30:27 +0200
On 25/01/2023 14:09, Mickey Petersen wrote:
> I understand. But nevertheless, I do get an error for that rule as it's missing (for some inexplicable reason).

Have you tried installing the latest version?

'M-x treesit-install-language-grammar' can help.




This bug report was last modified 2 years and 143 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.