GNU bug report logs -
#62951
29.0.90; c-ts-mode: Incorrect fontification due to FOR_EACH_TAIL_SAFE
Previous Next
Reported by: Eli Zaretskii <eliz <at> gnu.org>
Date: Wed, 19 Apr 2023 16:41:01 UTC
Severity: normal
Found in version 29.0.90
Fixed in version 29.1
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62951 in the body.
You can then email your comments to 62951 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Wed, 19 Apr 2023 16:41:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Eli Zaretskii <eliz <at> gnu.org>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Wed, 19 Apr 2023 16:41:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
To reproduce:
emacs -Q
C-x C-f src/fns.c RET
C-u 3365 M-g g
Observe that "if" and "STRINGP" in the body of FOR_EACH_TAIL_SAFE are
fontified in font-lock-function-name-face. This is because the
FOR_EACH_TAIL_SAFE macro is misparsed by tree-sitter as a declaration.
Can we teach c-ts-mode to recognize FOR_EACH_TAIL_SAFE and
FOR_EACH_TAIL for what they are, perhaps conditioned on
c-ts-mode-emacs-sources-support being non-nil?
In GNU Emacs 29.0.90 (build 25, i686-pc-mingw32) of 2023-04-17 built on
HOME-C4E4A596F7
Repository revision: 2f59595f5f4e86a791c425a6f9e2c1594d6813e4
Repository branch: emacs-29
Windowing system distributor 'Microsoft Corp.', version 5.1.2600
System Description: Microsoft Windows XP Service Pack 3 (v5.1.0.2600)
Configured using:
'configure -C --prefix=/d/usr --with-wide-int
--enable-checking=yes,glyphs 'CFLAGS=-O0 -gdwarf-4 -g3''
Configured features:
ACL GIF GMP GNUTLS HARFBUZZ JPEG JSON LCMS2 LIBXML2 MODULES NOTIFY
W32NOTIFY PDUMPER PNG RSVG SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XPM ZLIB
Important settings:
value of $LANG: ENU
locale-coding-system: cp1255
Major mode: C/*
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug message mailcap yank-media puny dired
dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
epg-config gnus-util text-property-search mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mail-utils time-date subr-x pp
wid-edit descr-text help-mode c-ts-mode c-ts-common treesit cl-seq
vc-git diff-mode easy-mmode vc vc-dispatcher bug-reference byte-opt gv
bytecomp byte-compile cc-mode cc-fonts cc-guess cc-menus cc-cmds
cc-styles cc-align cc-engine cc-vars cc-defs cl-loaddefs cl-lib rmc
iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook
vc-hooks lisp-float-type elisp-mode mwheel dos-w32 ls-lisp disp-table
term/w32-win w32-win w32-vars term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads w32notify w32 lcms2 multi-tty
make-network-process emacs)
Memory information:
((conses 16 100876 13128)
(symbols 48 10122 0)
(strings 16 31511 2068)
(string-bytes 1 995836)
(vectors 16 17200)
(vector-slots 8 233602 14782)
(floats 8 43 173)
(intervals 40 2885 151)
(buffers 888 16))
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Fri, 21 Apr 2023 20:38:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 62951 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> To reproduce:
>
> emacs -Q
> C-x C-f src/fns.c RET
> C-u 3365 M-g g
>
> Observe that "if" and "STRINGP" in the body of FOR_EACH_TAIL_SAFE are
> fontified in font-lock-function-name-face. This is because the
> FOR_EACH_TAIL_SAFE macro is misparsed by tree-sitter as a declaration.
>
> Can we teach c-ts-mode to recognize FOR_EACH_TAIL_SAFE and
> FOR_EACH_TAIL for what they are, perhaps conditioned on
> c-ts-mode-emacs-sources-support being non-nil?
I’m aware of this issue, but the truth is there isn’t a good solution to
it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
code after it (hard). In this case it’s a if statement, with macro calls
and AND operation in it’s condition, it’s already three things we need
to recognize and somehow handle. It can also be a for loop, a switch
case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
would need to handle every possible thing, at that point we might as
well have wrote a parser :-)
We can probably fix this very particular case, but it’s still work and
overhead, and doesn’t mean much.
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sat, 22 Apr 2023 07:18:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Fri, 21 Apr 2023 13:37:08 -0700
> Cc: 62951 <at> debbugs.gnu.org
>
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > To reproduce:
> >
> > emacs -Q
> > C-x C-f src/fns.c RET
> > C-u 3365 M-g g
> >
> > Observe that "if" and "STRINGP" in the body of FOR_EACH_TAIL_SAFE are
> > fontified in font-lock-function-name-face. This is because the
> > FOR_EACH_TAIL_SAFE macro is misparsed by tree-sitter as a declaration.
> >
> > Can we teach c-ts-mode to recognize FOR_EACH_TAIL_SAFE and
> > FOR_EACH_TAIL for what they are, perhaps conditioned on
> > c-ts-mode-emacs-sources-support being non-nil?
>
> I’m aware of this issue, but the truth is there isn’t a good solution to
> it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
> code after it (hard). In this case it’s a if statement, with macro calls
> and AND operation in it’s condition, it’s already three things we need
> to recognize and somehow handle. It can also be a for loop, a switch
> case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
> would need to handle every possible thing, at that point we might as
> well have wrote a parser :-)
Sorry, I don't understand the difficulties. The body of FOR_EACH_TAIL
and a few similar macros we use could be on of the following:
. a single simple statement
. an 'if' clause
. a 'while' loop
. a 'do' loop
. a 'for' loop
. a brace-delimited block (this one already works, AFAICS, so we
perhaps need not anything about it)
(In practice, only the first 2 and the last one are used, AFAICS.)
Doesn't tree-sitter tell us enough to figure out which of the above is
in the body? If so, why would we need to write a full parser?
> We can probably fix this very particular case, but it’s still work and
> overhead, and doesn’t mean much.
Please understand: good support for editing Emacs C sources is from my
POV imperative for c-ts-mode to gain traction. One of my gripes about
CC Mode was insufficient support for our macro system and for various
GCC attributes; that improved recently to some extend, but not enough,
and at a price of introducing ugly lists of strings that cause trouble
when used in file-local variables. We must do better in c-ts-mode!
So please make an effort of providing reasonable built-in solutions
for these idiosyncrasies of the Emacs C sources, conditioned on the
new variable c-ts-mode-emacs-sources-support, at least for those of
them that are used widely enough. It is very important.
TIA
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sun, 23 Apr 2023 00:29:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> On Apr 22, 2023, at 12:17 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Fri, 21 Apr 2023 13:37:08 -0700
>> Cc: 62951 <at> debbugs.gnu.org
>>
>>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>>> To reproduce:
>>>
>>> emacs -Q
>>> C-x C-f src/fns.c RET
>>> C-u 3365 M-g g
>>>
>>> Observe that "if" and "STRINGP" in the body of FOR_EACH_TAIL_SAFE are
>>> fontified in font-lock-function-name-face. This is because the
>>> FOR_EACH_TAIL_SAFE macro is misparsed by tree-sitter as a declaration.
>>>
>>> Can we teach c-ts-mode to recognize FOR_EACH_TAIL_SAFE and
>>> FOR_EACH_TAIL for what they are, perhaps conditioned on
>>> c-ts-mode-emacs-sources-support being non-nil?
>>
>> I’m aware of this issue, but the truth is there isn’t a good solution to
>> it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
>> code after it (hard). In this case it’s a if statement, with macro calls
>> and AND operation in it’s condition, it’s already three things we need
>> to recognize and somehow handle. It can also be a for loop, a switch
>> case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
>> would need to handle every possible thing, at that point we might as
>> well have wrote a parser :-)
>
> Sorry, I don't understand the difficulties. The body of FOR_EACH_TAIL
> and a few similar macros we use could be on of the following:
>
> . a single simple statement
> . an 'if' clause
> . a 'while' loop
> . a 'do' loop
> . a 'for' loop
> . a brace-delimited block (this one already works, AFAICS, so we
> perhaps need not anything about it)
>
> (In practice, only the first 2 and the last one are used, AFAICS.)
>
> Doesn't tree-sitter tell us enough to figure out which of the above is
> in the body? If so, why would we need to write a full parser?
Ok, the full parser part is a bit exaggeration :-) But my point is it’s a lot of dirty code for a subpar result. Take
if (NILP (XCDR (tail)) && STRINGP (XCAR (tail)))
for example, it’s parsed as a function definition and all the tokens in the condition are parsed as weird things like macro declarator, error, function declarator, type, etc. And if the condition changes to something else, say it has another layer of function call, it’ll parse differently.
>
>> We can probably fix this very particular case, but it’s still work and
>> overhead, and doesn’t mean much.
>
> Please understand: good support for editing Emacs C sources is from my
> POV imperative for c-ts-mode to gain traction. One of my gripes about
> CC Mode was insufficient support for our macro system and for various
> GCC attributes; that improved recently to some extend, but not enough,
> and at a price of introducing ugly lists of strings that cause trouble
> when used in file-local variables. We must do better in c-ts-mode!
>
> So please make an effort of providing reasonable built-in solutions
> for these idiosyncrasies of the Emacs C sources, conditioned on the
> new variable c-ts-mode-emacs-sources-support, at least for those of
> them that are used widely enough. It is very important.
What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.) If we can fix the parser, we don’t need to retrofit hacks onto font-lock, indent, etc, separately, and it truly fixes the problem. The downside is compiling from grammar source to grammar.c needs rust and node tools. But I guess depending on the grammar maintained by tree-sitter’s author isn’t too much different from depending on the grammar maintained by another individual (ie, me)?
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sun, 23 Apr 2023 06:25:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 22 Apr 2023 17:28:25 -0700
> Cc: 62951 <at> debbugs.gnu.org
>
> >> I’m aware of this issue, but the truth is there isn’t a good solution to
> >> it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
> >> code after it (hard). In this case it’s a if statement, with macro calls
> >> and AND operation in it’s condition, it’s already three things we need
> >> to recognize and somehow handle. It can also be a for loop, a switch
> >> case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
> >> would need to handle every possible thing, at that point we might as
> >> well have wrote a parser :-)
> >
> > Sorry, I don't understand the difficulties. The body of FOR_EACH_TAIL
> > and a few similar macros we use could be on of the following:
> >
> > . a single simple statement
> > . an 'if' clause
> > . a 'while' loop
> > . a 'do' loop
> > . a 'for' loop
> > . a brace-delimited block (this one already works, AFAICS, so we
> > perhaps need not anything about it)
> >
> > (In practice, only the first 2 and the last one are used, AFAICS.)
> >
> > Doesn't tree-sitter tell us enough to figure out which of the above is
> > in the body? If so, why would we need to write a full parser?
>
> Ok, the full parser part is a bit exaggeration :-) But my point is it’s a lot of dirty code for a subpar result. Take
>
> if (NILP (XCDR (tail)) && STRINGP (XCAR (tail)))
>
> for example, it’s parsed as a function definition and all the tokens in the condition are parsed as weird things like macro declarator, error, function declarator, type, etc. And if the condition changes to something else, say it has another layer of function call, it’ll parse differently.
But the top-level construct is 'if', no? Isn't that enough?
Also, can we detect the FOR_EACH_TAIL etc. macros themselves, and then
treat their body specially?
> > Please understand: good support for editing Emacs C sources is from my
> > POV imperative for c-ts-mode to gain traction. One of my gripes about
> > CC Mode was insufficient support for our macro system and for various
> > GCC attributes; that improved recently to some extend, but not enough,
> > and at a price of introducing ugly lists of strings that cause trouble
> > when used in file-local variables. We must do better in c-ts-mode!
> >
> > So please make an effort of providing reasonable built-in solutions
> > for these idiosyncrasies of the Emacs C sources, conditioned on the
> > new variable c-ts-mode-emacs-sources-support, at least for those of
> > them that are used widely enough. It is very important.
>
> What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.)
This goes against the purpose of using tree-sitter and its grammars.
I don't think we should maintain and develop language grammars,
especially since the production of the C sources from them needs
non-trivial system resources and additional tools.
Maybe coming up with a way of extending the C grammar in some more
general way, and then submitting the changes to the tree-sitter-c
developers for inclusion would be OK.
But I very much hope that we could support these macros at a lower
cost, by some tailored Lisp. Please give it a try. Even if the
result works only for the cases we actually use in our sources, it
might be "good enough" for us.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sun, 23 Apr 2023 21:05:01 GMT)
Full text and
rfc822 format available.
Message #20 received at 62951 <at> debbugs.gnu.org (full text, mbox):
On 23/04/2023 03:28, Yuan Fu wrote:
> What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.) If we can fix the parser, we don’t need to retrofit hacks onto font-lock, indent, etc, separately, and it truly fixes the problem. The downside is compiling from grammar source to grammar.c needs rust and node tools. But I guess depending on the grammar maintained by tree-sitter’s author isn’t too much different from depending on the grammar maintained by another individual (ie, me)?
We had also talked at some point about replacing the actual text that
the parser sees with something else.
If this can be done in a straightforward way (with tracking the
subsequent correspondence of "real" text back to the buffer for syntax
highlighting), that might be the perfect solution: we'd have a defcustom
which would hold a list of macros used in the current codebase in the
form of templates, and we'd set a bunch of them in emacs/.dir-locals.el.
I'm not sure how difficult this is to implement and maintain, but it's
probably going to be less work to maintain than a fork of the grammar.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Mon, 24 Apr 2023 07:03:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> On Apr 22, 2023, at 11:25 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Sat, 22 Apr 2023 17:28:25 -0700
>> Cc: 62951 <at> debbugs.gnu.org
>>
>>>> I’m aware of this issue, but the truth is there isn’t a good solution to
>>>> it. We need to recognize FOR_EACH_TAIL_SAFE (not hard) and fix arbitrary
>>>> code after it (hard). In this case it’s a if statement, with macro calls
>>>> and AND operation in it’s condition, it’s already three things we need
>>>> to recognize and somehow handle. It can also be a for loop, a switch
>>>> case, a function call, a while loop. If we want to fix FOR_EACH_TAIL we
>>>> would need to handle every possible thing, at that point we might as
>>>> well have wrote a parser :-)
>>>
>>> Sorry, I don't understand the difficulties. The body of FOR_EACH_TAIL
>>> and a few similar macros we use could be on of the following:
>>>
>>> . a single simple statement
>>> . an 'if' clause
>>> . a 'while' loop
>>> . a 'do' loop
>>> . a 'for' loop
>>> . a brace-delimited block (this one already works, AFAICS, so we
>>> perhaps need not anything about it)
>>>
>>> (In practice, only the first 2 and the last one are used, AFAICS.)
>>>
>>> Doesn't tree-sitter tell us enough to figure out which of the above is
>>> in the body? If so, why would we need to write a full parser?
>>
>> Ok, the full parser part is a bit exaggeration :-) But my point is it’s a lot of dirty code for a subpar result. Take
>>
>> if (NILP (XCDR (tail)) && STRINGP (XCAR (tail)))
>>
>> for example, it’s parsed as a function definition and all the tokens in the condition are parsed as weird things like macro declarator, error, function declarator, type, etc. And if the condition changes to something else, say it has another layer of function call, it’ll parse differently.
>
> But the top-level construct is 'if', no? Isn't that enough?
>
> Also, can we detect the FOR_EACH_TAIL etc. macros themselves, and then
> treat their body specially?
>
>>> Please understand: good support for editing Emacs C sources is from my
>>> POV imperative for c-ts-mode to gain traction. One of my gripes about
>>> CC Mode was insufficient support for our macro system and for various
>>> GCC attributes; that improved recently to some extend, but not enough,
>>> and at a price of introducing ugly lists of strings that cause trouble
>>> when used in file-local variables. We must do better in c-ts-mode!
>>>
>>> So please make an effort of providing reasonable built-in solutions
>>> for these idiosyncrasies of the Emacs C sources, conditioned on the
>>> new variable c-ts-mode-emacs-sources-support, at least for those of
>>> them that are used widely enough. It is very important.
>>
>> What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.)
>
> This goes against the purpose of using tree-sitter and its grammars.
> I don't think we should maintain and develop language grammars,
> especially since the production of the C sources from them needs
> non-trivial system resources and additional tools.
>
> Maybe coming up with a way of extending the C grammar in some more
> general way, and then submitting the changes to the tree-sitter-c
> developers for inclusion would be OK.
>
> But I very much hope that we could support these macros at a lower
> cost, by some tailored Lisp. Please give it a try. Even if the
> result works only for the cases we actually use in our sources, it
> might be "good enough" for us.
Fair enough, I’ll try :-)
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Wed, 26 Apr 2023 22:20:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> On Apr 23, 2023, at 2:04 PM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 23/04/2023 03:28, Yuan Fu wrote:
>> What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.) If we can fix the parser, we don’t need to retrofit hacks onto font-lock, indent, etc, separately, and it truly fixes the problem. The downside is compiling from grammar source to grammar.c needs rust and node tools. But I guess depending on the grammar maintained by tree-sitter’s author isn’t too much different from depending on the grammar maintained by another individual (ie, me)?
>
> We had also talked at some point about replacing the actual text that the parser sees with something else.
>
> If this can be done in a straightforward way (with tracking the subsequent correspondence of "real" text back to the buffer for syntax highlighting), that might be the perfect solution: we'd have a defcustom which would hold a list of macros used in the current codebase in the form of templates, and we'd set a bunch of them in emacs/.dir-locals.el.
>
> I'm not sure how difficult this is to implement and maintain, but it's probably going to be less work to maintain than a fork of the grammar.
Sounds to me a bit difficult to write. Eg, translating between tree-sitter position and buffer position efficiently isn’t too easy. Now plus narrowing, and what if the narrowing boundary is in the middle of a replace region?
My idea right now is to use the range feature in tree-sitter. Since the “body” of FOR_EACH_TAIL is valid C, I can either set the ranges for the parser so it ignores FOR_EACH_TAIL, or I can add another parser that only parses the body of FOR_EACH_TAIL.
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Thu, 27 Apr 2023 03:16:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 62951 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
>
> My idea right now is to use the range feature in tree-sitter. Since the “body” of FOR_EACH_TAIL is valid C, I can either set the ranges for the parser so it ignores FOR_EACH_TAIL
I end up going this route.
> or I can add another parser that only parses the body of FOR_EACH_TAIL.
Ok, here’s the patch. Eli, would you give it a try? This is a sizable patch so I’m not sure if you want it on emacs-29 or master.
Yuan
[for-each-tail-fix.patch (application/octet-stream, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Thu, 27 Apr 2023 08:59:02 GMT)
Full text and
rfc822 format available.
Message #32 received at 62951 <at> debbugs.gnu.org (full text, mbox):
On 27/04/2023 01:19, Yuan Fu wrote:
>
>
>> On Apr 23, 2023, at 2:04 PM, Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
>> On 23/04/2023 03:28, Yuan Fu wrote:
>>> What do you think of extending the parser to support these macros instead? (So we fork tree-sitter-c.) If we can fix the parser, we don’t need to retrofit hacks onto font-lock, indent, etc, separately, and it truly fixes the problem. The downside is compiling from grammar source to grammar.c needs rust and node tools. But I guess depending on the grammar maintained by tree-sitter’s author isn’t too much different from depending on the grammar maintained by another individual (ie, me)?
>>
>> We had also talked at some point about replacing the actual text that the parser sees with something else.
>>
>> If this can be done in a straightforward way (with tracking the subsequent correspondence of "real" text back to the buffer for syntax highlighting), that might be the perfect solution: we'd have a defcustom which would hold a list of macros used in the current codebase in the form of templates, and we'd set a bunch of them in emacs/.dir-locals.el.
>>
>> I'm not sure how difficult this is to implement and maintain, but it's probably going to be less work to maintain than a fork of the grammar.
>
> Sounds to me a bit difficult to write. Eg, translating between tree-sitter position and buffer position efficiently isn’t too easy. Now plus narrowing, and what if the narrowing boundary is in the middle of a replace region?
When the match isn't full, the replacement wouldn't be performed. Same
as with macro name that isn't fully typed out yet.
Yeah, it does seem like a lot of work, but the result might be that
everybody's macros could be supported.
I'm definitely not volunteering, though, so please take this as just a
suggestion.
> My idea right now is to use the range feature in tree-sitter. Since the “body” of FOR_EACH_TAIL is valid C, I can either set the ranges for the parser so it ignores FOR_EACH_TAIL, or I can add another parser that only parses the body of FOR_EACH_TAIL.
Sounds good. Especially for Emacs 29 (maybe).
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Thu, 27 Apr 2023 15:04:02 GMT)
Full text and
rfc822 format available.
Message #35 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 26 Apr 2023 20:14:45 -0700
> Cc: Eli Zaretskii <eliz <at> gnu.org>,
> 62951 <at> debbugs.gnu.org
>
> Ok, here’s the patch. Eli, would you give it a try?
It signals an error when I enable c-ts-mode:
c-ts-mode: Cannot load language definition: not-found, ("libtree-sitter-emacs-c" "libtree-sitter-emacs-c.dll"), "No such file or directory"
It looks like your "fake emacs-c language" trick somehow misfires?
The value of treesit-load-name-override-list is nil, which is not what
you intended, AFAICT? The only way I can make this work is by
manually customizing treesit-load-name-override-list before loading
c-ts-mode.
Otherwise, looks quite good; here are some other problems I found:
. some uses of FOR_EACH_TAIL are not fontified at all; examples:
comp.c, line 2079, fns.c, line 189
. FOR_EACH_LIVE_BUFFER (data.c, line 1430) is not recognized?
. FOR_EACH_FRAME (keyboard.c, line 1256) is not recognized?
This is much better than before already, so I think you should install
this on the emacs-29 branch, once you fix the above problems (assuming
they are easily fixable).
Thanks!
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Thu, 27 Apr 2023 19:57:02 GMT)
Full text and
rfc822 format available.
Message #38 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> On Apr 27, 2023, at 8:03 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Wed, 26 Apr 2023 20:14:45 -0700
>> Cc: Eli Zaretskii <eliz <at> gnu.org>,
>> 62951 <at> debbugs.gnu.org
>>
>> Ok, here’s the patch. Eli, would you give it a try?
>
> It signals an error when I enable c-ts-mode:
>
> c-ts-mode: Cannot load language definition: not-found, ("libtree-sitter-emacs-c" "libtree-sitter-emacs-c.dll"), "No such file or directory"
>
> It looks like your "fake emacs-c language" trick somehow misfires?
> The value of treesit-load-name-override-list is nil, which is not what
> you intended, AFAICT? The only way I can make this work is by
> manually customizing treesit-load-name-override-list before loading
> c-ts-mode.
Duh, sorry, dumb mistake. Fixed.
>
> Otherwise, looks quite good; here are some other problems I found:
>
> . some uses of FOR_EACH_TAIL are not fontified at all; examples:
> comp.c, line 2079, fns.c, line 189
You mean the FOE_EACH_TAIL part isn’t fontified, or the body isn’t fontified? Because the body are always fontified here. FOR_EACH_TAIL itself shouldn’t be fontified since it’s just a macro call and a variable.
> . FOR_EACH_LIVE_BUFFER (data.c, line 1430) is not recognized?
> . FOR_EACH_FRAME (keyboard.c, line 1256) is not recognized?
Didn’t know that they exited :-) Now I have FOR_EACH_TAIL, FOR_EACH_TAIL_SAFE, FOR_EACH_ALIST_VALUE, FOR_EACH_LIVE_BUFFER, FOR_EACH_FRAME.
>
> This is much better than before already, so I think you should install
> this on the emacs-29 branch, once you fix the above problems (assuming
> they are easily fixable).
Cool, I pushed the change.
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Fri, 28 Apr 2023 05:42:02 GMT)
Full text and
rfc822 format available.
Message #41 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Thu, 27 Apr 2023 12:56:07 -0700
> Cc: dmitry <at> gutov.dev,
> 62951 <at> debbugs.gnu.org
>
> > c-ts-mode: Cannot load language definition: not-found, ("libtree-sitter-emacs-c" "libtree-sitter-emacs-c.dll"), "No such file or directory"
> >
> > It looks like your "fake emacs-c language" trick somehow misfires?
> > The value of treesit-load-name-override-list is nil, which is not what
> > you intended, AFAICT? The only way I can make this work is by
> > manually customizing treesit-load-name-override-list before loading
> > c-ts-mode.
>
> Duh, sorry, dumb mistake. Fixed.
Thanks.
> > Otherwise, looks quite good; here are some other problems I found:
> >
> > . some uses of FOR_EACH_TAIL are not fontified at all; examples:
> > comp.c, line 2079, fns.c, line 189
>
> You mean the FOE_EACH_TAIL part isn’t fontified, or the body isn’t fontified? Because the body are always fontified here. FOR_EACH_TAIL itself shouldn’t be fontified since it’s just a macro call and a variable.
I mean the macro itself, FOR_EACH_TAIL. If it isn't supposed to be
fontified, then why is it fontified at line 856 of comp.c? It's
inconsistent. (However, this is a very minor problem, so if fixing
it is difficult, we can leave this alone for now.)
> > . FOR_EACH_LIVE_BUFFER (data.c, line 1430) is not recognized?
> > . FOR_EACH_FRAME (keyboard.c, line 1256) is not recognized?
>
> Didn’t know that they exited :-) Now I have FOR_EACH_TAIL, FOR_EACH_TAIL_SAFE, FOR_EACH_ALIST_VALUE, FOR_EACH_LIVE_BUFFER, FOR_EACH_FRAME.
Great, thanks.
> > This is much better than before already, so I think you should install
> > this on the emacs-29 branch, once you fix the above problems (assuming
> > they are easily fixable).
>
> Cool, I pushed the change.
Thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sat, 29 Apr 2023 22:56:02 GMT)
Full text and
rfc822 format available.
Message #44 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> On Apr 27, 2023, at 10:41 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Thu, 27 Apr 2023 12:56:07 -0700
>> Cc: dmitry <at> gutov.dev,
>> 62951 <at> debbugs.gnu.org
>>
>>> c-ts-mode: Cannot load language definition: not-found, ("libtree-sitter-emacs-c" "libtree-sitter-emacs-c.dll"), "No such file or directory"
>>>
>>> It looks like your "fake emacs-c language" trick somehow misfires?
>>> The value of treesit-load-name-override-list is nil, which is not what
>>> you intended, AFAICT? The only way I can make this work is by
>>> manually customizing treesit-load-name-override-list before loading
>>> c-ts-mode.
>>
>> Duh, sorry, dumb mistake. Fixed.
>
> Thanks.
>
>>> Otherwise, looks quite good; here are some other problems I found:
>>>
>>> . some uses of FOR_EACH_TAIL are not fontified at all; examples:
>>> comp.c, line 2079, fns.c, line 189
>>
>> You mean the FOE_EACH_TAIL part isn’t fontified, or the body isn’t fontified? Because the body are always fontified here. FOR_EACH_TAIL itself shouldn’t be fontified since it’s just a macro call and a variable.
>
> I mean the macro itself, FOR_EACH_TAIL. If it isn't supposed to be
> fontified, then why is it fontified at line 856 of comp.c? It's
> inconsistent. (However, this is a very minor problem, so if fixing
> it is difficult, we can leave this alone for now.)
Ah, I see. FOR_EACH_TAIL’s that has a bracketed body are not skipped and still have fontification on them. I pushed a change so that no one has fontification now. Also thanks for the doc fix :-)
Yuan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62951
; Package
emacs
.
(Sun, 30 Apr 2023 05:24:02 GMT)
Full text and
rfc822 format available.
Message #47 received at 62951 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 29 Apr 2023 15:55:26 -0700
> Cc: Dmitry Gutov <dmitry <at> gutov.dev>,
> 62951 <at> debbugs.gnu.org
>
> > I mean the macro itself, FOR_EACH_TAIL. If it isn't supposed to be
> > fontified, then why is it fontified at line 856 of comp.c? It's
> > inconsistent. (However, this is a very minor problem, so if fixing
> > it is difficult, we can leave this alone for now.)
>
> Ah, I see. FOR_EACH_TAIL’s that has a bracketed body are not skipped and still have fontification on them. I pushed a change so that no one has fontification now.
Thanks, looks good now.
bug marked as fixed in version 29.1, send any further explanations to
62951 <at> debbugs.gnu.org and Eli Zaretskii <eliz <at> gnu.org>
Request was from
Eli Zaretskii <eliz <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Sun, 30 Apr 2023 05:26:01 GMT)
Full text and
rfc822 format available.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sun, 28 May 2023 11:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 2 years and 26 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.