GNU bug report logs - #77255
Treesit font-lock override for embed ranges


Package: emacs;

Reported by: Juri Linkov <juri <at> linkov.net>

Date: Tue, 25 Mar 2025 18:30:02 UTC

Severity: normal

Fixed in version 31.0.50

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.



Message #23 received at 77255 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Mon, 31 Mar 2025 19:57:18 +0300
>>> Looks reasonable to me. But if it’s a minor mode, we might need to
>>> have a way to negate the change made to treesit-font-lock-settings?
>>> OTOH if we use :override, we might run into an override arms race when
>>> enabling multiple minor modes, etc.
>> 
>> We could declare that the last minor mode wins.  But indeed we still need
>> a way to restore the original treesit-font-lock-settings after disabling
>> the minor mode.

I'm convinced now that minor modes should be avoided since it's not
straightforward to revert the original settings when they are disabled.
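
For illustration only, a rough sketch of what reverting would involve.
All `my-embed-...' names are hypothetical and not part of any patch; the
sketch assumes the extra rules would be built with
`treesit-font-lock-rules'.  Restoring a snapshot like this silently drops
whatever another minor mode appended in the meantime, which is the
awkward part.

#+begin_src emacs-lisp
(defvar-local my-embed--saved-font-lock-settings nil
  "Snapshot of `treesit-font-lock-settings' taken when the mode is enabled.
Hypothetical variable, for illustration only.")

(defvar my-embed--extra-font-lock-settings nil
  "Hypothetical extra rules, e.g. the result of `treesit-font-lock-rules'.")

(define-minor-mode my-embed-override-mode
  "Hypothetical minor mode that appends font-lock rules for embed ranges."
  :lighter nil
  (if my-embed-override-mode
      (progn
        ;; Remember the settings as they were before this mode touched them.
        (setq my-embed--saved-font-lock-settings treesit-font-lock-settings)
        (setq-local treesit-font-lock-settings
                    (append treesit-font-lock-settings
                            my-embed--extra-font-lock-settings)))
    ;; Naive restore: also discards rules that other modes added after
    ;; this mode was enabled.
    (setq-local treesit-font-lock-settings
                my-embed--saved-font-lock-settings))
  ;; Re-enable features for the new settings and refontify the buffer.
  (treesit-font-lock-recompute-features)
  (font-lock-flush))
#+end_src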

Everything works nicely in the attached example of the liquid major mode
where liquid is the primary parser.  Currently it copies settings
from mhtml-ts-mode.  But later I'll try to inherit from mhtml-ts-mode.

>> BTW, I found another problem.  Please confirm whether the range rules allow
>> only one query per embed language, or am I doing something wrong?
>
> That’s curious, even if you included multiple patterns in the query, it’s
> still one query; and the range functions support multiple captured ranges
> when setting up ranges. So something is wrong here. (See
> treesit-query-range) I can look into this, but give me a few days.

Multiple captured ranges are not needed anymore for the attached example.

>> This revealed another problem.  Actually, Liquid is a preprocessor.
>> Since it can be embedded anywhere in any html node, regardless of
>> the structure in the html parser, it would be more correct to use
>> the liquid parser first, and then let the html+js+css parsers handle
>> the remaining parts.  But both the liquid and html parsers should apply
>> to the whole file.  The only difference is that liquid has a higher
>> precedence in deciding which overlapping parts belong to the liquid parser.
>> Or maybe it makes sense to have two primary parsers?  They both could
>> add their own highlighting.  And with regard to navigation, one of the
>> primary parsers could take precedence.
>
> IMO the preprocessor definitely should be the primary parser and let HTML embed
> in it. In the case of Liquid, it happens to use a syntax that’s compatible
> with HTML; that’s fine, but it’s not worth it or even necessary to add support
> for multiple primary parsers because of it. As for precedence, it can be
> customized by treesit-language-at-point-function.

Thanks for the suggestion to use the preprocessor as the primary parser.
So multiple primary parsers are not required anymore since the other parsers
are embedded in the primary parser ('define-treesit-generic-mode' sets
the primary parser).  And everything works at any embedding level:

liquid -> html -> js -> jsdoc
liquid -> html -> css
liquid -> yaml

#+begin_src emacs-lisp
(define-treesit-generic-mode liquid-generic-ts-mode
  "Tree-sitter generic mode for Liquid templates."
  :lang 'liquid
  :source "https://github.com/hankthetank27/tree-sitter-liquid"
  :mode-remap '(html-mode mhtml-mode html-ts-mode mhtml-ts-mode)
  :name "Liquid"
  ;; TODO: :parent mhtml-ts-mode

  (treesit-parser-create 'html)
  (treesit-parser-create 'css)
  (treesit-parser-create 'javascript)

  (setq-local treesit-range-settings
              (treesit-range-rules
               :embed 'html
               :host 'liquid
               '(((template_content) @cap))

               :embed 'javascript
               :host 'liquid
               '(((js_content) @cap))

               :embed 'css
               :host 'liquid
               '(((style_content) @cap))

               :embed 'javascript
               :host 'html
               '((script_element
                  (start_tag (tag_name))
                  (raw_text) @cap))

               :embed 'css
               :host 'html
               '((style_element
                  (start_tag (tag_name))
                  (raw_text) @cap))))

  (when (treesit-ready-p 'yaml t)
    (treesit-parser-create 'yaml)
    (setq-local treesit-range-settings
                (append treesit-range-settings
                        (treesit-range-rules
                         :embed 'yaml
                         :host 'liquid
                         '(((front_matter) @cap))))))

  (setq-local treesit-font-lock-settings
              (append treesit-font-lock-settings
                      html-ts-mode--font-lock-settings
                      js--treesit-font-lock-settings
                      (treesit-replace-font-lock-feature-settings
                       (treesit-font-lock-rules
                        :language 'css
                        :override t
                        :feature 'variable
                        '((plain_value) @mhtml-ts-mode--colorize-css-value
                          (color_value) @mhtml-ts-mode--colorize-css-value))
                       css--treesit-settings)))

  (setq-local treesit-font-lock-feature-list
              (treesit-merge-font-lock-feature-list
               treesit-font-lock-feature-list
               (treesit-merge-font-lock-feature-list
                html-ts-mode--treesit-font-lock-feature-list
                (treesit-merge-font-lock-feature-list
                 js--treesit-font-lock-feature-list
                 css--treesit-font-lock-feature-list))))

  (when (treesit-ready-p 'jsdoc t)
    (treesit-parser-create 'jsdoc)
    (setq-local treesit-range-settings
                (append treesit-range-settings
                        (treesit-range-rules
                         :embed 'jsdoc
                         :host 'javascript
                         :local t
                         `(((comment) @cap
                            (:match ,js--treesit-jsdoc-beginning-regexp @cap)))))))

  (setq-local treesit-thing-settings
              (append
               `((liquid (sexp (not ,(rx bos (or "program") eos)))
                         (list ,(rx bos (or "range"
                                            "if_statement"
                                            "for_loop_statement"
                                            "case_statement"
                                            "unless_statement"
                                            "capture_statement"
                                            "form_statement"
                                            "tablerow_statement"
                                            "paginate_statement")
                                    eos))))
               mhtml-ts-mode--treesit-thing-settings))

  (setq-local treesit-aggregated-outline-predicate
              `((liquid . ,(rx bos (or "if_statement"
                                       "for_loop_statement")
                               eos))
                (html . ,#'html-ts-mode--outline-predicate)
                (javascript . ,js-ts-mode--outline-predicate)
                (css . ,css-ts-mode--outline-predicate))))
#+end_src
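
On the precedence point quoted above: a minimal sketch of a function that
could be assigned to `treesit-language-at-point-function'.  It is not part
of the attached mode, and the `my-liquid-...' names are hypothetical.  It
assumes that the innermost embedded language is the one whose included
range around the position is the smallest, and it falls back to `liquid'
when no embedded range covers the position.  It only looks at the parsers
returned by `treesit-parser-list' with default arguments, so local parsers
(such as the jsdoc ones) may need separate handling, and the default
behavior may already be sufficient.

#+begin_src emacs-lisp
(defun my-liquid--language-at-point (pos)
  "Return the language of the innermost embedded range that contains POS.
Fall back to `liquid' when no embedded parser covers POS.
Hypothetical helper, for illustration only."
  (let ((best-lang 'liquid)
        (best-size nil))
    (dolist (parser (treesit-parser-list))
      ;; A parser without included ranges (here, the primary parser)
      ;; returns nil, so it never overrides the `liquid' default.
      (dolist (range (treesit-parser-included-ranges parser))
        (let ((size (- (cdr range) (car range))))
          (when (and (<= (car range) pos)
                     (< pos (cdr range))
                     (or (null best-size) (< size best-size)))
            (setq best-lang (treesit-parser-language parser)
                  best-size size)))))
    best-lang))

(defun my-liquid--setup-language-at-point ()
  "Install `my-liquid--language-at-point' in the current buffer."
  (setq-local treesit-language-at-point-function
              #'my-liquid--language-at-point))

;; The hook name follows from the mode name defined above.
(add-hook 'liquid-generic-ts-mode-hook
          #'my-liquid--setup-language-at-point)

(defun my-liquid-describe-language-at-point ()
  "Show the language that `treesit-language-at' reports at point."
  (interactive)
  (message "Language at point: %s" (treesit-language-at (point))))
#+end_src

Running `M-x my-liquid-describe-language-at-point' inside a script block,
a style block, or the front matter is a quick way to check the nesting
listed earlier.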




This bug report was last modified 91 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.