Package: emacs;
Reported by: Juri Linkov <juri <at> linkov.net>
Date: Tue, 25 Mar 2025 18:30:02 UTC
Severity: normal
Fixed in version 31.0.50
Done: Juri Linkov <juri <at> linkov.net>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 77255 in the body.
You can then email your comments to 77255 AT debbugs.gnu.org in the normal way.
Message #5 received at submit <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: bug-gnu-emacs <at> gnu.org
Subject: Treesit font-lock override for embed ranges
Date: Tue, 25 Mar 2025 20:25:17 +0200

It looks like we need a new keyword like ':override t' for
'treesit-range-rules' that would override host font-lock rules.

I'm trying to create a generic minor mode for AlpineJS framework
where some known HTML attributes contain JS code. For example:

  <div x-data="{ open: false }">
  <div x-bind:class="! open ? 'hidden' : ''">
  <span x-text="new Date().getFullYear()">

This works nicely with this code added for testing to mhtml-ts-mode:

#+begin_src emacs-lisp
(setq-local treesit-range-settings
            (append treesit-range-settings
                    (treesit-range-rules
                     :embed 'javascript
                     :host 'html
                     :local t
                     `((attribute
                        (attribute_name) @_name
                        (:match ,(rx (or "x-data" "x-bind" "x-text")) @_name)
                        (quoted_attribute_value
                         (attribute_value) @cap))))))
#+end_src

But the problem is that its highlighting is not visible because
host html-ts-mode font-lock overrides embedded js-ts-mode font-lock.

html-ts-mode--font-lock-settings contains:

   :language 'html
   :override t
   :feature 'string
   `((quoted_attribute_value) @font-lock-string-face)

So only the whole attribute is highlighted by font-lock-string-face
that overrides js highlighting.

Probably there is no way to add ':override t' to all 'javascript' rules
in 'js--treesit-font-lock-settings', like ':override t' is already
added to all 'jsdoc' rules in 'js--treesit-font-lock-settings'.
Message #8 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: 77255 <at> debbugs.gnu.org
Cc: Yuan Fu <casouri <at> gmail.com>, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Wed, 26 Mar 2025 09:27:14 +0200

> It looks like we need a new keyword like ':override t' for
> 'treesit-range-rules' that would override host font-lock rules.

Or maybe not. I managed to do this without any changes in core.
Also not sure if such functions as treesit-merge-font-lock-feature-list
and treesit-replace-font-lock-feature-settings could be used here.

The solution below works simply by replacing the html rule with
another rule that matches only HTML attributes that don't contain
js code:

#+begin_src emacs-lisp
(setq-local treesit-range-settings
            (append treesit-range-settings
                    (treesit-range-rules
                     :embed 'javascript
                     :host 'html
                     :local t
                     `((attribute
                        (attribute_name) @_name
                        (:match ,(rx (or "x-data" "x-bind" "x-text")) @_name)
                        (quoted_attribute_value
                         (attribute_value) @cap))))))

(setq-local treesit-font-lock-settings
            (mapcar (lambda (s)
                      (if (and (eq (treesit-query-language
                                    (treesit-font-lock-setting-query s))
                                   'html)
                               (eq (treesit-font-lock-setting-feature s)
                                   'string))
                          (car (treesit-font-lock-rules
                                :language 'html
                                :override t
                                :feature 'string
                                `((attribute
                                   (attribute_name) @_name
                                   ;; (:match (not ,(rx (or "x-data" "x-bind" "x-text"))) @_name)
                                   ;; (:pred (lambda (node)
                                   ;;          (not (string-match-p (rx (or "x-data" "x-bind" "x-text"))
                                   ;;                               (treesit-node-text node t))))
                                   ;;        @_name)
                                   (:pred mhtml-ts-mode--not-match @_name)
                                   (quoted_attribute_value) @font-lock-string-face))))
                        s))
                    treesit-font-lock-settings))
#+end_src

The commented out code shows attempts to use a negated :match
that is not supported. Also it seems a lambda for :pred is
also not supported. So needed to add a separate function:

#+begin_src emacs-lisp
(defun mhtml-ts-mode--not-match (node)
  (not (string-match-p (rx (or "x-data" "x-bind" "x-text"))
                       (treesit-node-text node t))))
#+end_src

Then everything works: all HTML attributes are highlighted
except those that should highlight js code in them.
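One detail of the predicate above may be worth a second look: the regexp is unanchored, so it would also accept an attribute name such as `max-data`, which happens to contain `x-data`. The following is only a sketch of an anchored variant, not code from the thread. The names `my-alpine-js-attribute-regexp`, `my-alpine-js-attribute-name-p` and `my-alpine--not-js-attribute` are hypothetical, and the string-based helper is split out so it can be tried without a tree-sitter grammar installed.

```emacs-lisp
(require 'rx)

;; Hypothetical name; anchored at the start of the attribute name so
;; that "x-bind:class" matches but "max-data" does not.
(defconst my-alpine-js-attribute-regexp
  (rx bos (or "x-data" "x-bind" "x-text"))
  "Regexp matching attribute names assumed to hold JavaScript.")

(defun my-alpine-js-attribute-name-p (name)
  "Return non-nil if the attribute NAME (a string) should hold JavaScript."
  (string-match-p my-alpine-js-attribute-regexp name))

(defun my-alpine--not-js-attribute (node)
  "Return non-nil unless NODE names an attribute that holds JavaScript.
Intended as a named function for `:pred', like the predicate above."
  (not (my-alpine-js-attribute-name-p (treesit-node-text node t))))
```

The same regexp constant could then be shared by the `:match` in the range rule and the `:pred` in the font-lock rule, so the two cannot drift apart.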
Message #11 received at 77255 <at> debbugs.gnu.org:
From: Vincenzo Pupillo <v.pupillo <at> gmail.com>
To: 77255 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Cc: Yuan Fu <casouri <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Wed, 26 Mar 2025 17:07:58 +0100

Hi Juri,

On Wednesday 26 March 2025 08:27:14 Central European Standard Time,
Juri Linkov wrote:
> > It looks like we need a new keyword like ':override t' for
> > 'treesit-range-rules' that would override host font-lock rules.
>
> Or maybe not. I managed to do this without any changes in core.
> Also not sure if such functions as treesit-merge-font-lock-feature-list
> and treesit-replace-font-lock-feature-settings could be used here.

treesit-replace-font-lock-feature-settings works reliably only if the
replacement is done for rules of the same language. So something like:

#+begin_src emacs-lisp
(setq-local liquid-ts-mode--font-lock-feature-list
            (treesit-replace-font-lock-feature-settings
             (treesit-font-lock-rules
              :language 'html
              :override t
              :feature 'string
              `((attribute
                 (attribute_name) @_name
                 (:pred mhtml-ts-mode--not-match @_name)
                 (quoted_attribute_value) @font-lock-string-face)))
             html-ts-mode--treesit-font-lock-settings))
#+end_src

Then:

  (defvar mhtml-ts-mode--treesit-font-lock-feature-list
    (treesit-merge-font-lock-feature-list
     liquid-ts-mode--treesit-font-lock-feature-list
     (treesit-merge-font-lock-feature-list
      js--treesit-font-lock-feature-list
      css--treesit-font-lock-feature-list))
    "Settings for `treesit-font-lock-feature-list'.")

However, we could modify treesit-replace-font-lock-feature-settings to check
the language in addition to the feature.

Vincenzo

> The solution below works simply by replacing the html rule with
> another rule that matches only HTML attributes that don't contain
> js code:
> [...]
Message #14 received at 77255 <at> debbugs.gnu.org:
From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Wed, 26 Mar 2025 21:20:31 -0700

> On Mar 26, 2025, at 12:27 AM, Juri Linkov <juri <at> linkov.net> wrote:
>
> The solution below works simply by replacing the html rule with
> another rule that matches only HTML attributes that don't contain
> js code:
>
> [...]
>
> The commented out code shows attempts to use a negated :match
> that is not supported. Also it seems a lambda for :pred is
> also not supported. So needed to add a separate function:
>
> [...]
>
> Then everything works: all HTML attributes are highlighted
> except those that should highlight js code in them.

Looks reasonable to me. But if it’s a minor mode, we might need to have a way
to negate the change made to treesit-font-lock-settings? OTOH if we use
:override, we might run into an override arms race when enabling multiple
minor modes, etc.

Yuan
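The "negate the change" concern can be illustrated with a minimal sketch of a minor mode that remembers the previous value of `treesit-font-lock-settings` and puts it back when disabled. This is an assumption-laden illustration rather than anything proposed in the thread: `my-alpine-mode`, `my-alpine--saved-font-lock-settings` and `my-alpine--adjust-settings` are hypothetical names, and the adjustment function is a placeholder for the rule replacement shown earlier.

```emacs-lisp
(require 'treesit)

(defvar-local my-alpine--saved-font-lock-settings nil
  "Value of `treesit-font-lock-settings' before `my-alpine-mode' changed it.")

(defun my-alpine--adjust-settings (settings)
  "Return a modified copy of the font-lock SETTINGS.
Placeholder: a real mode would swap in its replacement HTML `string'
rule here; this sketch returns an unchanged copy."
  (copy-sequence settings))

(define-minor-mode my-alpine-mode
  "Hypothetical minor mode that restores font-lock settings when disabled."
  :lighter " Alpine"
  (if my-alpine-mode
      (progn
        ;; Remember the settings that were in effect before this mode.
        (setq my-alpine--saved-font-lock-settings treesit-font-lock-settings)
        (setq-local treesit-font-lock-settings
                    (my-alpine--adjust-settings treesit-font-lock-settings)))
    ;; Put the remembered settings back.
    (setq-local treesit-font-lock-settings my-alpine--saved-font-lock-settings)
    (setq my-alpine--saved-font-lock-settings nil))
  ;; Ask font-lock to refontify only when it is active in this buffer.
  (when (bound-and-true-p font-lock-mode)
    (font-lock-flush)))
```

The weakness of this approach is the one raised above: if a second minor mode edits the settings in between, the saved snapshot is stale and restoring it silently discards the other mode's change. A real mode would probably also need to recompute the enabled features after changing the settings.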
Message #17 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Thu, 27 Mar 2025 21:04:10 +0200

>> Then everything works: all HTML attributes are highlighted
>> except those that should highlight js code in them.
>
> Looks reasonable to me. But if it’s a minor mode, we might need to
> have a way to negate the change made to treesit-font-lock-settings?
> OTOH if we use :override, we might run into an override arms race when
> enabling multiple minor modes, etc.

We could declare that the last minor mode wins. But indeed still need
a way to restore the original treesit-font-lock-settings after disabling
the minor mode.

BTW, I found another problem. Please confirm if the range rules allow
only one query per embed language, or I'm doing something wrong?

I tried two queries to enable the liquid parser in html nodes 'text'
and also in html attributes 'attribute_value':

#+begin_src emacs-lisp
(setq-local treesit-range-settings
            (append treesit-range-settings
                    (treesit-range-rules
                     :embed 'liquid
                     :host 'html
                     `(((text) @cap1
                        (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap1))
                       ((quoted_attribute_value
                         (attribute_value) @cap2)
                        (:match ,(rx (or "{{" "}}" "{%" "%}")) @cap2))))))
#+end_src

But it handles only one of these queries: when I remove the rule
for (text), it handles attribute_value, but when I remove the rule
for (attribute_value), it enables the liquid parser only for text.

This revealed another problem. Actually, Liquid is a preprocessor.
Since it can be embedded everywhere in every html node, not depending on
the structure in the html parser, it would be more correct first to use
the liquid parser, and then allow html+js+css parsers to handle
remaining parts. But both liquid and html parsers should apply
on the whole file. The only difference is that liquid has a higher
precedence to decide what overlapping parts belong to the liquid parser.
Or maybe it makes sense to have two primary parsers? They both could
add own highlighting. And in regard to navigation, one of primary
parsers could have a precedence.
Message #20 received at 77255 <at> debbugs.gnu.org:
From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Sat, 29 Mar 2025 01:23:47 -0700

> On Mar 27, 2025, at 12:04 PM, Juri Linkov <juri <at> linkov.net> wrote:
>
> [...]
>
> BTW, I found another problem. Please confirm if the range rules allow
> only one query per embed language, or I'm doing something wrong?

That’s curious, even if you included multiple patterns in the query, it’s
still one query; and the range functions support multiple captured ranges when
setting up ranges. So something is wrong here. (See treesit-query-range) I can
look into this, but give me a few days.

> I tried two queries to enable the liquid parser in html nodes 'text'
> and also in html attributes 'attribute_value':
>
> [...]
>
> This revealed another problem. Actually, Liquid is a preprocessor.
> Since it can be embedded everywhere in every html node, not depending on
> the structure in the html parser, it would be more correct first to use
> the liquid parser, and then allow html+js+css parsers to handle
> remaining parts. But both liquid and html parsers should apply
> on the whole file. The only difference is that liquid has a higher
> precedence to decide what overlapping parts belong to the liquid parser.
> Or maybe it makes sense to have two primary parsers? They both could
> add own highlighting. And in regard to navigation, one of primary
> parsers could have a precedence.

IMO preprocessor definitely should be the primary parser and let HTML embed in
it. In the case of Liquid, it happens to use a syntax that’s compatible with
HTML; that’s fine, but it’s [not] worth it or even necessary to add support
for multiple primary parsers because of it. As for precedence, it can be
customized by treesit-language-at-point-function.

Yuan
Message #23 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Mon, 31 Mar 2025 19:57:18 +0300

>>> Looks reasonable to me. But if it’s a minor mode, we might need to
>>> have a way to negate the change made to treesit-font-lock-settings?
>>> OTOH if we use :override, we might run into an override arms race when
>>> enabling multiple minor modes, etc.
>>
>> We could declare that the last minor mode wins. But indeed still need
>> a way to restore the original treesit-font-lock-settings after disabling
>> the minor mode.

I'm convinced now that minor modes should be avoided since it's not
straightforward to revert the original settings when they are disabled.

Everything works nicely in the attached example of the liquid major mode
where liquid is the primary parser. Currently it copies settings
from mhtml-ts-mode. But later I'll try to inherit from mhtml-ts-mode.

>> BTW, I found another problem. Please confirm if the range rules allow
>> only one query per embed language, or I'm doing something wrong?
>
> That’s curious, even if you included multiple patterns in the query, it’s
> still one query; and the range functions support multiple captured ranges
> when setting up ranges. So something is wrong here. (See
> treesit-query-range) I can look into this, but give me a few days.

Multiple captured ranges are not needed anymore for the attached example.

>> This revealed another problem. Actually, Liquid is a preprocessor.
>> [...]
>> Or maybe it makes sense to have two primary parsers? They both could
>> add own highlighting. And in regard to navigation, one of primary
>> parsers could have a precedence.
>
> IMO preprocessor definitely should be the primary parser and let HTML embed
> in it. In the case of Liquid, it happens to use a syntax that’s compatible
> with HTML; that’s fine, but it’s [not] worth it or even necessary to add
> support for multiple primary parsers because of it. As for precedence, it
> can be customized by treesit-language-at-point-function.

Thanks for the suggestion to use the preprocessor as the primary parser.
So multiple primary parsers are not required anymore since other parsers
are embedded to the primary parser ('define-treesit-generic-mode' sets
the primary parser). And everything works for any embedded level:

  liquid -> html -> js -> jsdoc
  liquid -> html -> css
  liquid -> yaml

#+begin_src emacs-lisp
(define-treesit-generic-mode liquid-generic-ts-mode
  "Tree-sitter generic mode for Liquid templates."
  :lang 'liquid
  :source "https://github.com/hankthetank27/tree-sitter-liquid"
  :mode-remap '(html-mode mhtml-mode html-ts-mode mhtml-ts-mode)
  :name "Liquid"
  ;; TODO: :parent mhtml-ts-mode

  (treesit-parser-create 'html)
  (treesit-parser-create 'css)
  (treesit-parser-create 'javascript)

  (setq-local treesit-range-settings
              (treesit-range-rules
               :embed 'html
               :host 'liquid
               '(((template_content) @cap))

               :embed 'javascript
               :host 'liquid
               '(((js_content) @cap))

               :embed 'css
               :host 'liquid
               '(((style_content) @cap))

               :embed 'javascript
               :host 'html
               '((script_element
                  (start_tag (tag_name))
                  (raw_text) @cap))

               :embed 'css
               :host 'html
               '((style_element
                  (start_tag (tag_name))
                  (raw_text) @cap))))

  (when (treesit-ready-p 'yaml t)
    (treesit-parser-create 'yaml)
    (setq-local treesit-range-settings
                (append treesit-range-settings
                        (treesit-range-rules
                         :embed 'yaml
                         :host 'liquid
                         '(((front_matter) @cap))))))

  (setq-local treesit-font-lock-settings
              (append treesit-font-lock-settings
                      html-ts-mode--font-lock-settings
                      js--treesit-font-lock-settings
                      (treesit-replace-font-lock-feature-settings
                       (treesit-font-lock-rules
                        :language 'css
                        :override t
                        :feature 'variable
                        '((plain_value) @mhtml-ts-mode--colorize-css-value
                          (color_value) @mhtml-ts-mode--colorize-css-value))
                       css--treesit-settings)))

  (setq-local treesit-font-lock-feature-list
              (treesit-merge-font-lock-feature-list
               treesit-font-lock-feature-list
               (treesit-merge-font-lock-feature-list
                html-ts-mode--treesit-font-lock-feature-list
                (treesit-merge-font-lock-feature-list
                 js--treesit-font-lock-feature-list
                 css--treesit-font-lock-feature-list))))

  (when (treesit-ready-p 'jsdoc t)
    (treesit-parser-create 'jsdoc)
    (setq-local treesit-range-settings
                (append treesit-range-settings
                        (treesit-range-rules
                         :embed 'jsdoc
                         :host 'javascript
                         :local t
                         `(((comment) @cap
                            (:match ,js--treesit-jsdoc-beginning-regexp @cap)))))))

  (setq treesit-thing-settings
        (append
         `((liquid (sexp (not ,(rx bos (or "program") eos)))
                   (list ,(rx bos (or "range"
                                      "if_statement"
                                      "for_loop_statement"
                                      "case_statement"
                                      "unless_statement"
                                      "capture_statement"
                                      "form_statement"
                                      "tablerow_statement"
                                      "paginate_statement")
                              eos))))
         mhtml-ts-mode--treesit-thing-settings))

  (setq-local treesit-aggregated-outline-predicate
              `((liquid . ,(rx bos (or "if_statement"
                                       "for_loop_statement")
                               eos))
                (html . ,#'html-ts-mode--outline-predicate)
                (javascript . ,js-ts-mode--outline-predicate)
                (css . ,css-ts-mode--outline-predicate))))
#+end_src
Message #26 received at 77255 <at> debbugs.gnu.org:
From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Mon, 31 Mar 2025 17:38:00 -0700

> On Mar 31, 2025, at 9:57 AM, Juri Linkov <juri <at> linkov.net> wrote:
>
> [...]
>
> Thanks for the suggestion to use the preprocessor as the primary parser.
> So multiple primary parsers are not required anymore since other parsers
> are embedded to the primary parser ('define-treesit-generic-mode' sets
> the primary parser). And everything works for any embedded level:
>
> liquid -> html -> js -> jsdoc
> liquid -> html -> css
> liquid -> yaml

Awesome!
Message #29 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: Vincenzo Pupillo <v.pupillo <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Tue, 01 Apr 2025 20:17:58 +0300

> However, we could modify treesit-replace-font-lock-feature-settings to check
> the language in addition to the feature.

Thanks for the suggestion. So I modified
treesit-replace-font-lock-feature-settings
to check the language as well.
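To make "check the language in addition to the feature" concrete, here is a rough sketch of what such a replacement could look like. It is not the change that was committed to Emacs: the function name `my-treesit-replace-settings` is hypothetical, and it relies only on the accessors already used earlier in this thread (`treesit-font-lock-setting-feature`, `treesit-font-lock-setting-query`, `treesit-query-language`), so it assumes a build that provides them.

```emacs-lisp
(require 'seq)
(require 'treesit)

(defun my-treesit-replace-settings (new-settings settings)
  "Return SETTINGS with entries replaced by matches from NEW-SETTINGS.
An old entry is replaced only when a new entry has both the same
feature and the same query language; other entries are kept as is.
Hypothetical helper, not the function shipped in treesit.el."
  (mapcar
   (lambda (old)
     (or (seq-find
          (lambda (new)
            (and (eq (treesit-font-lock-setting-feature new)
                     (treesit-font-lock-setting-feature old))
                 (eq (treesit-query-language
                      (treesit-font-lock-setting-query new))
                     (treesit-query-language
                      (treesit-font-lock-setting-query old)))))
          new-settings)
         old))
   settings))
```

With a check like this, replacing the CSS `variable` rule would leave a JavaScript `variable` rule in a combined settings list untouched, which is the situation the earlier message about "rules of the same language" describes.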
Message #32 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Tue, 01 Apr 2025 20:18:36 +0300

close 77255 31.0.50
thanks

>> liquid -> html -> js -> jsdoc
>> liquid -> html -> css
>> liquid -> yaml
>
> Awesome!

So now added to treesit-x.el and closed.
Message #37 received at 77255 <at> debbugs.gnu.org:
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Tue, 01 Apr 2025 20:43:34 +0300

> Everything works nicely in the attached example of the liquid major mode
> where liquid is the primary parser. Currently it copies settings
> from mhtml-ts-mode. But later I'll try to inherit from mhtml-ts-mode.

However, with inheritance that causes several changes of the primary parser
and several calls to treesit-major-mode-setup during the ts-mode initialization,
sometimes I get such backtraces only after the first edit:

  Debugger entered--Lisp error: (treesit-node-outdated #<treesit-node-outdated>)
    #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_78>(beg #<treesit-node-outdated>)
    treesit-navigate-thing(35 1 beg html-ts-mode--outline-predicate)

Could you suggest where to look?
Message #40 received at 77255 <at> debbugs.gnu.org:
From: Vincenzo Pupillo <v.pupillo <at> gmail.com>
To: Yuan Fu <casouri <at> gmail.com>, Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Tue, 01 Apr 2025 21:17:55 +0200

Wow!!! Great work!

On Monday 31 March 2025 18:57:18 Central European Summer Time,
Juri Linkov wrote:
> [...]
> Thanks for the suggestion to use the preprocessor as the primary parser.
> So multiple primary parsers are not required anymore since other parsers
> are embedded to the primary parser ('define-treesit-generic-mode' sets
> the primary parser). And everything works for any embedded level:
>
> liquid -> html -> js -> jsdoc
> liquid -> html -> css
> liquid -> yaml
>
> [...]
Message #43 received at 77255 <at> debbugs.gnu.org (Tue, 01 Apr 2025 23:42:02 GMT):
From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Tue, 1 Apr 2025 16:41:13 -0700
> On Apr 1, 2025, at 10:43 AM, Juri Linkov <juri <at> linkov.net> wrote:
>
>> Everything works nicely in the attached example of the liquid major mode
>> where liquid is the primary parser. Currently it copies settings
>> from mhtml-ts-mode. But later I'll try to inherit from mhtml-ts-mode.
>
> However, with inheritance that causes several changes of the
> primary parser and several calls to treesit-major-mode-setup
> during the ts-mode initialization, sometimes I get such backtraces
> only after the first edit:
>
> Debugger entered--Lisp error: (treesit-node-outdated #<treesit-node-outdated>)
>   #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_78>(beg #<treesit-node-outdated>)
>   treesit-navigate-thing(35 1 beg html-ts-mode--outline-predicate)
>
> Could you suggest where to look?

Could it be that you passed a lambda/closure to treesit-navigate-thing
which contains a tree-sitter node?

Yuan
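For readers unfamiliar with the failure mode Yuan is asking about, here is a minimal sketch of how a closure can end up holding an outdated node. The two function names are hypothetical and not taken from treesit.el or from the liquid example; only the treesit-* calls are real. It assumes the current buffer already has a tree-sitter parser.

```emacs-lisp
;;; -*- lexical-binding: t -*-
(require 'treesit)

;; Hypothetical helper: returns a predicate that closes over a node
;; fetched at creation time.  Once the buffer has been edited and
;; reparsed, ANCHOR is outdated and accessing it is expected to signal
;; `treesit-node-outdated'.
(defun my-make-stale-predicate ()
  "Return a predicate that captures the node at point right now."
  (let ((anchor (treesit-node-at (point))))
    (lambda (node)
      (< (treesit-node-start node) (treesit-node-start anchor)))))

;; Hypothetical alternative: look the anchor node up on every call, so
;; the predicate never holds a node across an edit.
(defun my-fresh-predicate (node)
  "Return non-nil if NODE starts before the node currently at point."
  (let ((anchor (treesit-node-at (point))))
    (< (treesit-node-start node) (treesit-node-start anchor))))
```

A predicate built by the first function is only safe as long as the buffer does not change; the second form trades a lookup per call for never keeping a node around.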
Message #46 received at 77255 <at> debbugs.gnu.org (Wed, 02 Apr 2025 07:01:03 GMT):
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Wed, 02 Apr 2025 09:55:18 +0300
>>> Everything works nicely in the attached example of the liquid major mode
>>> where liquid is the primary parser. Currently it copies settings
>>> from mhtml-ts-mode. But later I'll try to inherit from mhtml-ts-mode.
>>
>> However, with inheritance that causes several changes of the
>> primary parser and several calls to treesit-major-mode-setup
>> during the ts-mode initialization, sometimes I get such backtraces
>> only after the first edit:
>>
>> Debugger entered--Lisp error: (treesit-node-outdated #<treesit-node-outdated>)
>>   #<subr F616e6f6e796d6f75732d6c616d626461_anonymous_lambda_78>(beg #<treesit-node-outdated>)
>>   treesit-navigate-thing(35 1 beg html-ts-mode--outline-predicate)
>>
>> Could you suggest where to look?
>
> Could it be that you passed a lambda/closure to treesit-navigate-thing
> which contains a tree-sitter node?

Thanks, indeed this is because outline-minor-mode was activated too early
on the wrong hook. It should be activated on the last hook, i.e.
liquid-generic-ts-mode-hook instead of the parent's mhtml-ts-mode-hook.
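As an init-file fragment, the fix described above might look like the sketch below. It assumes liquid-generic-ts-mode is defined as in the example earlier in this thread and inherits from mhtml-ts-mode; the remove-hook line only matters if outline-minor-mode had previously been added to the parent's hook.

```emacs-lisp
;; Assumption: `liquid-generic-ts-mode' inherits from `mhtml-ts-mode',
;; so each mode has its own hook variable.

;; Drop the activation from the parent's hook, which per the message
;; above ran too early for the liquid setup.
(remove-hook 'mhtml-ts-mode-hook #'outline-minor-mode)

;; Activate outline-minor-mode from the most-derived mode's hook instead.
(add-hook 'liquid-generic-ts-mode-hook #'outline-minor-mode)
```

Note that removing the parent hook entry also stops outline-minor-mode from being enabled in plain mhtml-ts-mode buffers; keep it there if that is wanted and the liquid mode is not affected.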
Message #49 received at 77255 <at> debbugs.gnu.org (Wed, 09 Apr 2025 17:34:02 GMT):
From: Juri Linkov <juri <at> linkov.net>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Wed, 09 Apr 2025 20:31:28 +0300
BTW, for testing multi-parser ranges I used such a trick that
highlights different ranges using different background colors from hi-lock.
Maybe something like this could be added to 'treesit-explore-mode'
or 'treesit-inspect-mode':

diff --git a/lisp/treesit.el b/lisp/treesit.el
index 8e57a6dae14..5a2721cdda4 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -1036,6 +1026,9 @@ treesit--update-ranges-non-local
           (overlay-put ov 'treesit-parser embed-parser)
           (overlay-put ov 'treesit-parser-local-p nil)
           (overlay-put ov 'treesit-host-parser host-parser)
+          ;; (overlay-put ov 'font-lock-face (nth embed-level hi-lock-face-defaults))
+          (overlay-put ov 'font-lock-face (nth (length (memq embed-parser (treesit-parser-list))) hi-lock-face-defaults))
+          (overlay-put ov 'priority (+ 1000 embed-level))
           (overlay-put ov 'treesit-parser-ov-timestamp
                        modified-tick)))))
     ;; Set ranges for the embed parser.
@@ -1130,6 +1123,9 @@ treesit--update-ranges-local
           (treesit-parser-set-embed-level embedded-parser embed-level)
           (overlay-put ov 'treesit-parser embedded-parser)
           (overlay-put ov 'treesit-parser-local-p t)
+          ;; (overlay-put ov 'font-lock-face (nth embed-level hi-lock-face-defaults))
+          (overlay-put ov 'font-lock-face (nth (length (memq embedded-parser (treesit-parser-list))) hi-lock-face-defaults))
+          (overlay-put ov 'priority (+ 1000 embed-level))
           (overlay-put ov 'treesit-host-parser host-parser)
           (overlay-put ov 'treesit-parser-ov-timestamp
                        modified-tick)
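The same idea can be tried without patching treesit.el. The sketch below is a hypothetical pair of commands (the my- names and the my-treesit-highlight marker property are made up for illustration). It relies on the internal 'treesit-parser overlay property visible in the patch above, which may change between Emacs versions, and it sets 'face rather than 'font-lock-face and leaves overlay priority alone, so the result may differ from the patch.

```emacs-lisp
(require 'treesit)
(require 'hi-lock)   ; provides `hi-lock-face-defaults', a list of face names

(defun my-treesit-highlight-ranges ()
  "Give each tree-sitter range overlay in the buffer a hi-lock face.
Hypothetical debugging helper, not part of treesit.el."
  (interactive)
  (let ((parsers (treesit-parser-list)))
    (dolist (ov (overlays-in (point-min) (point-max)))
      (let ((parser (overlay-get ov 'treesit-parser)))
        (when parser
          ;; Same index computation as in the patch above, wrapped with
          ;; `mod' so it always falls inside `hi-lock-face-defaults'.
          (let* ((index (length (memq parser parsers)))
                 (name (nth (mod index (length hi-lock-face-defaults))
                            hi-lock-face-defaults)))
            (overlay-put ov 'face (intern name))
            ;; Marker so the faces can be removed again later.
            (overlay-put ov 'my-treesit-highlight t)))))))

(defun my-treesit-unhighlight-ranges ()
  "Remove the faces added by `my-treesit-highlight-ranges'."
  (interactive)
  (dolist (ov (overlays-in (point-min) (point-max)))
    (when (overlay-get ov 'my-treesit-highlight)
      (overlay-put ov 'face nil)
      (overlay-put ov 'my-treesit-highlight nil))))
```

Since treesit.el may create new range overlays when ranges are updated, the highlighting command probably needs to be re-run after edits.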
Message #52 received at 77255 <at> debbugs.gnu.org (Thu, 17 Apr 2025 23:41:02 GMT):
From: Yuan Fu <casouri <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 77255 <at> debbugs.gnu.org, Vincenzo Pupillo <v.pupillo <at> gmail.com>
Subject: Re: bug#77255: Treesit font-lock override for embed ranges
Date: Thu, 17 Apr 2025 16:40:24 -0700
> On Apr 9, 2025, at 10:31 AM, Juri Linkov <juri <at> linkov.net> wrote:
>
> BTW, for testing multi-parser ranges I used such a trick that
> highlights different ranges using different background colors from hi-lock.
> Maybe something like this could be added to 'treesit-explore-mode'
> or 'treesit-inspect-mode':
>
> diff --git a/lisp/treesit.el b/lisp/treesit.el
> index 8e57a6dae14..5a2721cdda4 100644
> --- a/lisp/treesit.el
> +++ b/lisp/treesit.el
> @@ -1036,6 +1026,9 @@ treesit--update-ranges-non-local
>            (overlay-put ov 'treesit-parser embed-parser)
>            (overlay-put ov 'treesit-parser-local-p nil)
>            (overlay-put ov 'treesit-host-parser host-parser)
> +          ;; (overlay-put ov 'font-lock-face (nth embed-level hi-lock-face-defaults))
> +          (overlay-put ov 'font-lock-face (nth (length (memq embed-parser (treesit-parser-list))) hi-lock-face-defaults))
> +          (overlay-put ov 'priority (+ 1000 embed-level))
>            (overlay-put ov 'treesit-parser-ov-timestamp
>                         modified-tick)))))
>      ;; Set ranges for the embed parser.
> @@ -1130,6 +1123,9 @@ treesit--update-ranges-local
>            (treesit-parser-set-embed-level embedded-parser embed-level)
>            (overlay-put ov 'treesit-parser embedded-parser)
>            (overlay-put ov 'treesit-parser-local-p t)
> +          ;; (overlay-put ov 'font-lock-face (nth embed-level hi-lock-face-defaults))
> +          (overlay-put ov 'font-lock-face (nth (length (memq embedded-parser (treesit-parser-list))) hi-lock-face-defaults))
> +          (overlay-put ov 'priority (+ 1000 embed-level))
>            (overlay-put ov 'treesit-host-parser host-parser)
>            (overlay-put ov 'treesit-parser-ov-timestamp
>                         modified-tick)

Yes, that’ll be a fantastic addition. I thought about something like this
but didn’t have the time to implement it.

Yuan
Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 16 May 2025 11:24:09 GMT)