GNU bug report logs - #61043
30.0.50; `json-ts-mode': invalid font lock rule


Package: emacs;

Reported by: Mickey Petersen <mickey <at> masteringemacs.org>

Date: Tue, 24 Jan 2023 20:10:02 UTC

Severity: normal

Found in version 30.0.50



Message #8 received at 61043 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>, "61043 <at> debbugs.gnu.org"
 <61043 <at> debbugs.gnu.org>
Subject: RE: [External] : bug#61043: 30.0.50; `json-ts-mode': invalid font
 lock rule
Date: Tue, 24 Jan 2023 21:08:09 +0000

> There's a comment font lock rule in `json-ts-mode'. However, that is
> illegal and against the JSON spec, and indeed the search query fails
> because `comment' is not a valid node type.

Caveat: Not following this thread, ignorant of
tree sitter, and probably ignorant of the use-
case context.

JSON syntax per its spec(s) is one thing.
JSON out there in the wild is something else.

There are zillions of JSON documents that
aren't well-formed per the specs.  And lots of
apps that create and use such data.

As a result, in the real world, tools that we
expect to be useful for working with real data
need to be _able_ (optionally) to handle at
least the more common such deviations from
what the specs prescribe.

One way to do that is to have a variable/mode
that controls the kind(s) of well-formedness
you want to enforce.  E.g., have two modes:
lax and strict.  Or let functions dealing with
data have an optional arg that specifies the
syntax (lax or strict) to enforce.
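
As a rough illustration of the optional-argument idea, here is
a minimal Python sketch.  The names `parse_json' and `syntax'
are made up for this example, and "lax" here tolerates only one
deviation (a comma just before a closing ] or }).  It is not
how any Emacs mode or existing library does it.

```python
import json


def _strip_trailing_commas(text):
    """Drop a comma that directly precedes ] or }, outside of strings."""
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "]}":
            # Look back past whitespace for a dangling comma.
            j = len(out) - 1
            while j >= 0 and out[j] in " \t\r\n":
                j -= 1
            if j >= 0 and out[j] == ",":
                del out[j]
        out.append(ch)
    return "".join(out)


def parse_json(text, syntax="strict"):
    """Parse TEXT; SYNTAX is "strict" or "lax" (hypothetical switch)."""
    if syntax not in ("strict", "lax"):
        raise ValueError(f"unknown syntax: {syntax!r}")
    if syntax == "lax":
        text = _strip_trailing_commas(text)
    return json.loads(text)
```

With this sketch, parse_json('[1, 2, 3,]', syntax="lax") gives
a three-element list, while the default strict call on the same
text raises an error.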

And of course, we'd want to document just what
"lax" mode means: what syntax departures from
the specs our lax syntax tolerates.

A lax syntax, for example, often follows
JavaScript syntax for object fields, treats
boolean and null values as case-insensitive,
and is more permissive than the JSON specs
about numerals, whitespace, and the escaping
of Unicode characters.
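
To make the case-insensitivity point concrete, a small Python
sketch follows.  `parse_literal' and its `syntax' argument are
hypothetical names, used only for illustration.

```python
_LITERALS = {"true": True, "false": False, "null": None}


def parse_literal(token, syntax="strict"):
    """Map a boolean/null token to a value; lax mode ignores case."""
    key = token.lower() if syntax == "lax" else token
    if key not in _LITERALS:
        raise ValueError(f"not a JSON literal: {token!r}")
    return _LITERALS[key]
```

In lax mode, tokens such as TRUE or Null are accepted; in
strict mode only the lowercase spellings from the specs are.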

E.g., in JavaScript notation, a field name in
an object literal can be, but need not be, in
double quotation marks.  And alternatively it
can be in single quotation marks.
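
A hedged Python sketch of accepting the three field-name forms.
The function name `field_name' is invented here, and the sketch
simplifies: unquoted names are limited to ASCII identifiers,
and single-quoted names containing a backslash or a quote are
rejected rather than unescaped.

```python
import json
import re

# Simplification: ASCII-only identifier pattern for unquoted names.
_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*\Z")


def field_name(token, syntax="lax"):
    """Return the field name denoted by TOKEN (hypothetical helper)."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        # The one form the specs allow; let the json module unescape it.
        return json.loads(token)
    if syntax == "strict":
        raise ValueError(f"field name must be double-quoted: {token!r}")
    if len(token) >= 2 and token[0] == "'" and token[-1] == "'":
        inner = token[1:-1]
        if "\\" in inner or "'" in inner:
            raise ValueError("escapes are not handled in this sketch")
        return inner
    if _IDENTIFIER.match(token):
        return token
    raise ValueError(f"not a field name: {token!r}")
```

So "name" in double quotes, 'name' in single quotes, and a bare
name all yield the same string in lax mode, and only the first
form is accepted in strict mode.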

Other things often allowed:

* An extra comma (,) after the last element of
  an array or the last member of an object
  (e.g., [a, b, c,], {a:b, c:d,}).
* Numerals with leading zeros (e.g., 0042.3).
* Fractional numerals that lack 0 before the
  decimal point (e.g., .14 instead of 0.14).
* Numerals with no fractional part after the
  decimal point (e.g., 342. or 1.e27).
* A plus sign (+) preceding a numeral, meaning
  that the number is non-negative (e.g., +1.3).
* Treating all ASCII control chars, and the
  ASCII space character, as (insignificant)
  whitespace chars.
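
The numeral items in the list above can be sketched in Python
as two patterns, one strict and one lax.  `parse_number' and
both regexps are assumptions written for this example; the
strict pattern is my reading of the number grammar in the JSON
specs, not code taken from any tool.

```python
import re

# Strict: optional minus, no leading zeros, digits on both sides of ".".
_STRICT_NUMBER = re.compile(
    r"-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?\Z")

# Lax: also allows "+", leading zeros, ".14", "342." and "1.e27".
_LAX_NUMBER = re.compile(
    r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\Z")


def parse_number(text, syntax="strict"):
    """Convert a numeral to int or float under the chosen syntax."""
    pattern = _STRICT_NUMBER if syntax == "strict" else _LAX_NUMBER
    if not pattern.match(text):
        raise ValueError(f"invalid {syntax} numeral: {text!r}")
    if re.fullmatch(r"[+-]?\d+", text):
        return int(text)
    return float(text)
```

Each example numeral from the list (0042.3, .14, 342., 1.e27,
+1.3) passes with syntax="lax" and is rejected with the
default strict setting.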

Lax-syntax JSON data is everywhere.  It's good
to have tools that enforce the strict syntax of
the standards, but it's also good to have tools
that tolerate real-world, loosey-goosey JSON.

HTH.




This bug report was last modified 2 years and 144 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.