#77255 - Treesit font-lock override for embed ranges

GNU bug report logs - #77255
Treesit font-lock override for embed ranges

Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Tue, 25 Mar 2025 18:30:02 UTC

Severity: normal

Fixed in version 31.0.50

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.

Message #20 received at 77255 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Sat, 29 Mar 2025 01:23:47 -0700

> On Mar 27, 2025, at 12:04 PM, Juri Linkov <juri <at> linkov.net> wrote:
>
>>> The commented out code shows attempts to use a negated :match
>>> that is not supported.  Also it seems a lambda for :pred is
>>> also not supported.  So needed to add a separate function:
>>>
>>> #+begin_src emacs-lisp
>>> (defun mhtml-ts-mode--not-match (node)
>>>   (not (string-match-p (rx (or "x-data" "x-bind" "x-text"))
>>>                        (treesit-node-text node t))))
>>> #+end_src
>>>
>>> Then everything works: all HTML attributes are highlighted
>>> except those that should highlight js code in them.
>>
>> Looks reasonable to me. But if it’s a minor mode, we might need to
>> have a way to negate the change made to treesit-font-lock-settings?
>> OTOH if we use :override, we might run into an override arms race when
>> enabling multiple minor modes, etc.
>
> We could declare that the last minor mode wins.  But indeed still need
> a way to restore the original treesit-font-lock-settings after disabling
> the minor mode.
>
> BTW, I found another problem.  Please confirm if the range rules allow
> only one query per embed language, or I'm doing something wrong?

That’s curious, even if you included multiple patterns in the query, it’s still one query; and the range functions support multiple captured ranges when setting up ranges. So something is wrong here. (See treesit-query-range) I can look into this, but give me a few days.

> I tried two queries to enable the liquid parser in html nodes 'text'
> and also in html attributes 'attribute_value':
>
> #+begin_src emacs-lisp
> (setq-local treesit-range-settings
>             (append treesit-range-settings
>                     (treesit-range-rules
>                      :embed 'liquid
>                      :host 'html
>                      `(((text) @cap1
>                         (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap1))
>                        ((quoted_attribute_value
>                          (attribute_value) @cap2)
>                         (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap2))))))
> #+end_src
>
> But it handles only one of these queries: when I remove the rule
> for (text), it handles attribute_value, but when I remove the rule
> for (attribute_value), it enables the liquid parser only for text.
>
> This revealed another problem.  Actually, Liquid is a preprocessor.
> Since it can be embedded everywhere in every html node, not depending on
> the structure in the html parser, it would be more correct first to use
> the liquid parser, and then allow html+js+css parsers to handle
> remaining parts.  But both liquid and html parsers should apply
> on the whole file.  The only difference is that liquid has a higher
> precedence to decide what overlapping parts belong to the liquid parser.
> Or maybe it makes sense to have two primary parsers?  They both could
> add own highlighting.  And in regard to navigation, one of primary
> parsers could have a precedence.

IMO preprocessor definitely should be the primary parser and let HTML embed in it. In the case of Liquid, it happens to use a syntax that’s compatible with HTML; that’s fine, but it’s worth it or even necessary to add support for multiple primary parsers because of it. As for precedence, it can be customized by treesit-language-at-point-function.

Yuan

This bug report was last modified 91 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
