GNU bug report logs - #77255
Treesit font-lock override for embed ranges
Reported by: Juri Linkov <juri <at> linkov.net>
Date: Tue, 25 Mar 2025 18:30:02 UTC
Severity: normal
Fixed in version 31.0.50
Done: Juri Linkov <juri <at> linkov.net>
Bug is archived. No further changes may be made.
Message #20 received at 77255 <at> debbugs.gnu.org (full text, mbox):
> On Mar 27, 2025, at 12:04 PM, Juri Linkov <juri <at> linkov.net> wrote:
>
>>> The commented out code shows attempts to use a negated :match
>>> that is not supported. Also it seems a lambda for :pred is
>>> also not supported. So needed to add a separate function:
>>>
>>> #+begin_src emacs-lisp
>>> (defun mhtml-ts-mode--not-match (node)
>>>   (not (string-match-p (rx (or "x-data" "x-bind" "x-text"))
>>>                        (treesit-node-text node t))))
>>> #+end_src
>>>
>>> Then everything works: all HTML attributes are highlighted
>>> except those that should highlight js code in them.
>>
>> Looks reasonable to me. But if it’s a minor mode, we might need to
>> have a way to negate the change made to treesit-font-lock-settings?
>> OTOH if we use :override, we might run into an override arms race when
>> enabling multiple minor modes, etc.
>
> We could declare that the last minor mode wins. But indeed still need
> a way to restore the original treesit-font-lock-settings after disabling
> the minor mode.
>
> BTW, I found another problem. Please confirm if the range rules allow
> only one query per embed language, or I'm doing something wrong?
That’s curious. Even if you include multiple patterns in the query, it’s still one query, and the range functions support multiple captured ranges when setting up ranges (see treesit-query-range). So something is wrong here. I can look into this, but give me a few days.
> I tried two queries to enable the liquid parser in html nodes 'text'
> and also in html attributes 'attribute_value':
>
> #+begin_src emacs-lisp
> (setq-local treesit-range-settings
>             (append treesit-range-settings
>                     (treesit-range-rules
>                      :embed 'liquid
>                      :host 'html
>                      `(((text) @cap1
>                         (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap1))
>                        ((quoted_attribute_value
>                          (attribute_value) @cap2)
>                         (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap2))))))
> #+end_src
>
> But it handles only one of these queries: when I remove the rule
> for (text), it handles attribute_value, but when I remove the rule
> for (attribute_value), it enables the liquid parser only for text.
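One way to narrow this down might be to run the same two patterns through treesit-query-range directly and see whether both kinds of node are captured. The sketch below assumes Emacs 29 or later and an installed html grammar; the helper name my-liquid--debug-ranges is hypothetical and not part of Emacs. If it returns ranges for both text and attribute values, the query itself is probably fine and the problem lies in how the ranges are applied.

```emacs-lisp
(require 'treesit)

;; Hypothetical debugging helper; the name is illustrative only.
(defun my-liquid--debug-ranges ()
  "Return the ranges captured by either pattern in the current buffer.
Each range is a cons (START . END).  Return nil when tree-sitter or
the `html' grammar is not available."
  (when (and (treesit-available-p)
             (treesit-language-available-p 'html))
    (treesit-query-range
     'html
     ;; One query with two patterns, mirroring the rule quoted above.
     `(((text) @cap
        (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap))
       ((quoted_attribute_value (attribute_value) @cap)
        (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap))))))
```

Evaluating (my-liquid--debug-ranges) in a buffer that has Liquid tags both in text and in an attribute value should show whether both patterns match.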
>
> This revealed another problem. Actually, Liquid is a preprocessor.
> Since it can be embedded everywhere in every html node, not depending on
> the structure in the html parser, it would be more correct first to use
> the liquid parser, and then allow html+js+css parsers to handle
> remaining parts. But both liquid and html parsers should apply
> on the whole file. The only difference is that liquid has a higher
> precedence to decide what overlapping parts belong to the liquid parser.
> Or maybe it makes sense to have two primary parsers? They both could
> add own highlighting. And in regard to navigation, one of primary
> parsers could have a precedence.
IMO the preprocessor definitely should be the primary parser and let HTML embed in it. In the case of Liquid, it happens to use a syntax that’s compatible with HTML; that’s fine, but it’s worth it or even necessary to add support for multiple primary parsers because of it. As for precedence, it can be customized by treesit-language-at-point-function.
Yuan
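A minimal sketch of the precedence point, assuming a purely textual heuristic rather than anything from the actual fix: treesit-language-at-point-function is called with a buffer position and returns a language symbol, so a mode could prefer liquid whenever the position falls inside an unclosed {{ or {% delimiter. The function name my-liquid--language-at-point is hypothetical.

```emacs-lisp
;; Hypothetical helper; a crude text-based heuristic, not the approach
;; taken in Emacs itself.
(defun my-liquid--language-at-point (pos)
  "Return `liquid' if POS is inside a Liquid tag, otherwise `html'."
  (save-excursion
    (goto-char pos)
    (let ((open (save-excursion
                  (re-search-backward (rx (or "{{" "{%")) nil t)))
          (close (save-excursion
                   (re-search-backward (rx (or "}}" "%}")) nil t))))
      ;; Inside a tag when the nearest delimiter before POS is an opener.
      (if (and open (or (null close) (> open close)))
          'liquid
        'html))))

;; In a major mode's setup this would be installed buffer-locally:
;; (setq-local treesit-language-at-point-function
;;             #'my-liquid--language-at-point)
```

A real mode would more likely consult the parsers' ranges than search the text, but the shape of the function is the same.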
This bug report was last modified 91 days ago.