GNU bug report logs -
#61208
29.0.60; treesit-beginning/end-of-defun problem with macros in c-ts-mode
To reply to this bug, email your comments to 61208 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Wed, 01 Feb 2023 09:14:02 GMT)
Acknowledgement sent to yingchao.yang <at> seaboxdata.com: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 01 Feb 2023 09:14:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
#define SWITCH()
#define CASE(name) case name:

void func(int i) // LINE_E
{
  SWITCH(i) // LINE_D
  {
    CASE(A) // LINE_C
    {
      ;
    }
    CASE(B) // LINE_B
    {
      ; // LINE_A
    }
  }
}
With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
finally to LINE_E...
On Wed, Feb 01 2023, Yang Yingchao <yang.yingchao <at> qq.com> wrote:
> Forgive me the mess...
>
>
> src
>
>
> On Wed, Feb 01 2023, Yang Yingchao <yang.yingchao <at> qq.com> wrote:
>
>> Hi **,
>>
>> From: Yang Yingchao <yang.yingchao <at> qq.com>
>> Reply-To: yang.yingchao <at> qq.com
>> Date: Wed, 01 Feb 2023 14:19:30 +0800
>> Cc: yang.yingchao <at> qq.com
>> To: bug-gnu-emacs <at> gnu.org
>> Subject: 29.0.60; treesit-beginning/end-of-defun problem with macros in c-ts-mode
>> --text follows this line--
>>
>> treesit-beginning/end-of-defun in c-ts-mode does not work correctly with macros.
>>
>> For example, in the following code:
>>
>> #define SWITCH()
>> #define CASE(name) case name:
>>
>> void func(int i) // LINE_E
>> {
>>   SWITCH(i) // LINE_D
>>   {
>>     CASE(A) // LINE_C
>>     {
>>       ;
>>     }
>>     CASE(B) // LINE_B
>>     {
>>       ; // LINE_A
>>     }
>>   }
>> }
>>
>> With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
>> pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
>> finally to LINE_E…
>>
>> Regards…
>>
>> In GNU Emacs 29.0.60 (build 15, x86_64-pc-linux-gnu, GTK+ Version 3.24.35,
>> cairo version 1.17.6) of 2023-02-01 built on tbook
>> Repository revision: c345ec43995051e3fb412cfb8f24d0e931b7de5e
>> Repository branch: yc-hacking
>> System Description: Gentoo Linux
>>
>> Configured using:
>> 'configure 'CFLAGS=-O2 -march=native -pipe -g' LDFLAGS=
>> --with-native-compilation --without-pop --without-imagemagick --with-xml2
>> --with-json --with-modules --with-pgtk'
>>
>> Configured features: ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS
>> HARFBUZZ JPEG JSON LCMS2 LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY
>> PDUMPER PGTK PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
>> TREE_SITTER WEBP XIM GTK3 ZLIB
>>
>> Important settings:
>>   value of $LANG: zh_CN.UTF8
>>   value of $XMODIFIERS: @im=fcitx
>>   locale-coding-system: utf-8-unix
>>
>> Major mode: C
>>
>> Minor modes in effect:
>>   tooltip-mode: t
>>   global-eldoc-mode: t
>>   show-paren-mode: t
>>   electric-indent-mode: t
>>   mouse-wheel-mode: t
>>   tool-bar-mode: t
>>   menu-bar-mode: t
>>   file-name-shadow-mode: t
>>   global-font-lock-mode: t
>>   font-lock-mode: t
>>   blink-cursor-mode: t
>>   line-number-mode: t
>>   indent-tabs-mode: t
>>   transient-mark-mode: t
>>   auto-composition-mode: t
>>   auto-encryption-mode: t
>>   auto-compression-mode: t
>>
>> Load-path shadows: None found.
>>
>> Features: (shadow sort mail-extr emacsbug message mailcap yank-media puny
>> dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068
>> epg-config gnus-util text-property-search time-date mm-decode mm-bodies
>> mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
>> rfc2045 ietf-drums mm-util mail-prsvr mail-utils c-ts-mode c-ts-common treesit
>> pp cl-print byte-opt thingatpt help-fns radix-tree cc-mode cc-fonts cc-guess
>> cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs cl-loaddefs comp
>> comp-cstr warnings icons subr-x rx cl-seq cl-macs gv cl-extra help-mode
>> bytecomp byte-compile cl-lib china-util rmc iso-transl tooltip cconv eldoc
>> paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
>> term/pgtk-win pgtk-win term/common-win pgtk-dnd tool-bar dnd fontset image
>> regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
>> prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer
>> select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
>> frame minibuffer nadvice seq simple cl-generic indonesian philippine cham
>> georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
>> japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic
>> indian cyrillic chinese composite emoji-zwj charscript charprop case-table
>> epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button
>> loaddefs theme-loaddefs faces cus-face macroexp files window text-properties
>> overlay sha1 md5 base64 format env code-pages mule custom widget keymap
>> hashtable-print-readable backquote threads dbusbind inotify dynamic-setting
>> system-font-setting font-render-setting cairo gtk pgtk lcms2 multi-tty
>> make-network-process native-compile emacs)
>>
>> Memory information: ((conses 16 116552 13400) (symbols 48 9439 0) (strings 32
>> 29132 1837) (string-bytes 1 934955) (vectors 16 19030) (vector-slots 8 379928
>> 16623) (floats 8 36 34) (intervals 56 432 0) (buffers 984 14))
>>
>> --
>> Yang Yingchao
Yang Yingchao
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Wed, 01 Feb 2023 12:51:02 GMT)
Message #8 received at 61208 <at> debbugs.gnu.org (full text, mbox):
> Cc: yang.yingchao <at> qq.com
> Date: Wed, 01 Feb 2023 14:33:24 +0800
> From: Yang Yingchao via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>
>
> #define SWITCH()
> #define CASE(name) case name:
>
> void func(int i) // LINE_E
> {
> SWITCH(i) // LINE_D
> {
> CASE(A) // LINE_C
> {
> ;
> }
> CASE(B) // LINE_B
> {
> ; // LINE_A
> }
> }
> }
>
> With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
> pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
> finally to LINE_E...
Set treesit-defun-tactic to 'top-level, and your problem is solved.
Yuan, Theo: do we want to have that set by default in c-ts-mode? C
doesn't have nested functions, so it should be a better default, what
with all the cpp madness that the C grammar doesn't grok.
Maybe also in C++ and Java -- AFAIU they don't have nested functions
either.
WDYT?
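For anyone who wants this behaviour without waiting for a change of default, a minimal init-file sketch follows. It assumes Emacs 29 built with tree-sitter and the C grammar installed; `my-c-ts-defuns-top-level` is a hypothetical helper name, not something Emacs provides.

```elisp
;; Hypothetical helper (not part of Emacs): make defun movement in
;; c-ts-mode buffers consider only top-level defuns, so nodes that the
;; parser reports inside a function body are ignored.
(defun my-c-ts-defuns-top-level ()
  "Use the `top-level' defun tactic in the current buffer."
  (setq-local treesit-defun-tactic 'top-level))

(add-hook 'c-ts-mode-hook #'my-c-ts-defuns-top-level)
```

With this in place, `C-M-a` from LINE_A in the example should go straight to LINE_E.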
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Wed, 01 Feb 2023 13:11:02 GMT)
Message #11 received at 61208 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Cc: yang.yingchao <at> qq.com
>> Date: Wed, 01 Feb 2023 14:33:24 +0800
>> From: Yang Yingchao via "Bug reports for GNU Emacs,
>> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>>
>>
>> #define SWITCH()
>> #define CASE(name) case name:
>>
>> void func(int i) // LINE_E
>> {
>> SWITCH(i) // LINE_D
>> {
>> CASE(A) // LINE_C
>> {
>> ;
>> }
>> CASE(B) // LINE_B
>> {
>> ; // LINE_A
>> }
>> }
>> }
>>
>> With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
>> pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
>> finally to LINE_E...
>
> Set treesit-defun-tactic to 'top-level, and your problem is solved.
>
> Yuan, Theo: do we want to have that set by default in c-ts-mode? C
> doesn't have nested functions, so it should be a better default, what
> with all the cpp madness that the C grammar doesn't grok.
>
> Maybe also in C++ and Java -- AFAIU they don't have nested functions
> either.
>
> WDYT?
I'm fine with that change, I think. Other, "smaller" constructs can be
found as sentences or sexps anyway, I think.
Theo
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Thu, 02 Feb 2023 02:33:01 GMT)
Message #14 received at 61208 <at> debbugs.gnu.org (full text, mbox):
> On Feb 1, 2023, at 4:49 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> Cc: yang.yingchao <at> qq.com
>> Date: Wed, 01 Feb 2023 14:33:24 +0800
>> From: Yang Yingchao via "Bug reports for GNU Emacs,
>> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>>
>>
>> #define SWITCH()
>> #define CASE(name) case name:
>>
>> void func(int i) // LINE_E
>> {
>> SWITCH(i) // LINE_D
>> {
>> CASE(A) // LINE_C
>> {
>> ;
>> }
>> CASE(B) // LINE_B
>> {
>> ; // LINE_A
>> }
>> }
>> }
>>
>> With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
>> pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
>> finally to LINE_E...
>
> Set treesit-defun-tactic to 'top-level, and your problem is solved.
>
> Yuan, Theo: do we want to have that set by default in c-ts-mode? C
> doesn't have nested functions, so it should be a better default, what
> with all the cpp madness that the C grammar doesn't grok.
>
> Maybe also in C++ and Java -- AFAIU they don't have nested functions
> either.
treesit-defun-tactic being 'nested isn't the problem here, at least not the direct cause. c-ts-mode doesn't consider switch cases or if-else statements as defuns; it only considers functions, structs, enums, and unions as defuns. So in a preprocessed C source file, C-M-a would move point to the beginning of the function, LINE_E. It does not in this particular file because tree-sitter is thrown off by the SWITCH() and CASE() macros: it can't tell what they are and parses them as function definitions.
I don't object to setting treesit-defun-tactic to 'top-level in c-ts-mode, though. It can hide problems like this. Just be aware that it merely hides the problem.
C++ and Java have classes, and when point is in a class, I think people expect to move to the previous/next method rather than the beginning/end of the class. So 'nested is still a better default there, IMO.
Yuan
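One way to check the misparse described above is to list the node types that enclose point. The following is only a sketch: `my-treesit-ancestor-types` is a hypothetical name, it assumes the buffer already has a tree-sitter parser (for example, it is in c-ts-mode), and the node types reported depend on the installed grammar version.

```elisp
(require 'treesit)

;; Hypothetical helper (not part of Emacs).
(defun my-treesit-ancestor-types ()
  "Show the tree-sitter node types from the root down to point."
  (interactive)
  (let ((node (treesit-node-at (point)))
        (types '()))
    ;; Walk up from the leaf at point; pushing each type onto the
    ;; front of the list leaves the root first.
    (while node
      (push (treesit-node-type node) types)
      (setq node (treesit-node-parent node)))
    (message "%s" (mapconcat #'identity types " > "))))
```

Running it at LINE_A shows whether the blocks after SWITCH(i) and CASE(B) sit under function-definition nodes. The built-in `treesit-explore-mode` presents the same tree interactively.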
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Thu, 02 Feb 2023 06:46:01 GMT)
Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):
On Thu, Feb 02 2023, Yang Yingchao <yang.yingchao <at> qq.com> wrote:
> On Wed, Feb 01 2023, Theodor Thornhill <theo <at> thornhill.no> wrote:
>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>>>> Cc: yang.yingchao <at> qq.com
>>>> Date: Wed, 01 Feb 2023 14:33:24 +0800
>>>> From: Yang Yingchao via "Bug reports for GNU Emacs,
>>>> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>>>>
>>>>
>>>> #define SWITCH()
>>>> #define CASE(name) case name:
>>>>
>>>> void func(int i) // LINE_E
>>>> {
>>>> SWITCH(i) // LINE_D
>>>> {
>>>> CASE(A) // LINE_C
>>>> {
>>>> ;
>>>> }
>>>> CASE(B) // LINE_B
>>>> {
>>>> ; // LINE_A
>>>> }
>>>> }
>>>> }
>>>>
>>>> With the cursor at LINE_A, pressing `C-M-a` moves it to LINE_B;
>>>> pressing `C-M-a` again moves it to LINE_C, then to LINE_D, and
>>>> finally to LINE_E...
>>>
>>> Set treesit-defun-tactic to 'top-level, and your problem is solved.
>>>
>>> Yuan, Theo: do we want to have that set by default in c-ts-mode? C
>>> doesn't have nested functions, so it should be a better default, what
>>> with all the cpp madness that the C grammar doesn't grok.
>>>
>>> Maybe also in C++ and Java -- AFAIU they don't have nested functions
>>> either.
>>>
>>> WDYT?
>>
>> I'm fine with that change, I think. Other, "smaller" constructs can be
>> found as sentences or sexps anyway, I think.
>>
>> Theo
>
Thanks for the help.
But for the following C++ code, is it possible to make treesit-beginning/end-of-defun behave the same as in c++-mode?
,----
| class Test // LINE_D
| {
| public:
|   Test(int i) // LINE_C
|   {
|     SWITCH(i)
|     {
|       CASE(A)
|       {
|         ;
|       }
|       CASE(B) // LINE_B
|       {
|         ; // LINE_A
|       }
|     }
|   }
| };
`----
With the cursor at LINE_A, `C-M-a` in c++-mode moves the cursor to LINE_C, which is correct.
But in c++-ts-mode, the behaviour of `C-M-a` is wrong:
if treesit-defun-tactic is nested, it moves to LINE_B, and if treesit-defun-tactic is top-level,
it moves to LINE_D. Both of them are actually wrong...
--
Yang Yingchao
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Thu, 02 Feb 2023 07:17:02 GMT)
Message #23 received at 61208 <at> debbugs.gnu.org (full text, mbox):
> From: Yang Yingchao <yang.yingchao <at> qq.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Yuan Fu <casouri <at> gmail.com>,
> 61208 <at> debbugs.gnu.org
> Date: Thu, 02 Feb 2023 08:48:55 +0800
>
> But for the following C++ code, is it possible to make treesit-beginning/end-of-defun behave the same as in c++-mode?
>
> ,----
> | class Test // LINE_D
> | {
> | public:
> | Test(int i) // LINE_C
> | {
> | SWITCH(i)
> | {
> | CASE(A)
> | {
> | ;
> | }
> | CASE(B) // LINE_B
> | {
> | ; // LINE_A
> | }
> | }
> | }
> | };
> `----
>
>
> With the cursor at LINE_A, `C-M-a` in c++-mode moves the cursor to LINE_C, which is correct.
> But in c++-ts-mode, the behaviour of `C-M-a` is wrong:
> if treesit-defun-tactic is nested, it moves to LINE_B, and if treesit-defun-tactic is top-level,
> it moves to LINE_D. Both of them are actually wrong...
I don't necessarily agree that c++-mode is right in this case. I
think it's sheer luck that it goes to where it goes, and small changes
in the cpp macros could easily defeat its logic.
This is all a consequence of the fact that cpp macros that change the
language syntax could have unexpected influence on what the major mode
does with movement by defuns. It is not a coincidence that such usage
of cpp macros is discouraged by modern coding conventions and
recommendations.
From my POV, there's no bug here. There's no requirement that the TS
modes behave the same as their non-TS brethren. One could argue that
we introduced the TS modes precisely _because_ they behave
differently. And where cpp macros are involved, all bets are off to
begin with; good support for them is only possible by teaching the
mode about each and every macro.
So I'm okay with closing this bug as wontfix, unless someone has an
easy way of "fixing" it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Thu, 02 Feb 2023 07:42:01 GMT)
Message #26 received at 61208 <at> debbugs.gnu.org (full text, mbox):
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 1 Feb 2023 18:32:26 -0800
> Cc: yingchao.yang <at> seaboxdata.com,
> Theodor Thornhill <theo <at> thornhill.no>,
> 61208 <at> debbugs.gnu.org,
> yang.yingchao <at> qq.com
>
> treesit-defun-tactic being 'nested isn't the problem here, at least not the direct cause. c-ts-mode doesn't consider switch cases or if-else statements as defuns; it only considers functions, structs, enums, and unions as defuns. So in a preprocessed C source file, C-M-a would move point to the beginning of the function, LINE_E. It does not in this particular file because tree-sitter is thrown off by the SWITCH() and CASE() macros: it can't tell what they are and parses them as function definitions.
>
> I don't object to setting treesit-defun-tactic to 'top-level in c-ts-mode, though. It can hide problems like this. Just be aware that it merely hides the problem.
OK, I think I will make that change soon.
> C++ and Java have classes, and when point is in a class, I think people expect to move to the previous/next method rather than the beginning/end of the class. So 'nested is still a better default there, IMO.
OK, I see your point, and I think you are right.
Btw, I noticed that C-M-a in c++-ts-mode goes to the BOL of the line
where the function/class/namespace is declared, whereas c++-mode goes
to the first non-whitespace character on that line. Isn't the
c++-mode way better? If you agree, we should probably change
c++-ts-mode (and maybe also java-ts-mode?) to behave like CC mode, but
we should also make sure that changing this will not adversely affect
"C-c C-q" and "C-M-q". WDYT?
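The difference between the two landing positions can be sketched with a small wrapper command. This only illustrates the behaviour being discussed and is not the change made in Emacs; `my-beginning-of-defun-indentation` is a hypothetical name.

```elisp
;; Hypothetical command (not part of Emacs).
(defun my-beginning-of-defun-indentation (&optional arg)
  "Move to the beginning of the defun, then to its first non-blank char.
ARG is passed on to `beginning-of-defun'."
  (interactive "p")
  ;; `beginning-of-defun' consults `beginning-of-defun-function' when
  ;; the major mode sets it, and leaves point at the beginning of a line.
  (beginning-of-defun arg)
  ;; Skip the leading whitespace on that line, which matters for
  ;; indented declarations such as methods inside a class.
  (back-to-indentation))
```

Whether such a change interacts badly with the indentation commands mentioned above would still need to be checked.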
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#61208; Package emacs. (Thu, 02 Feb 2023 18:23:02 GMT)
Message #29 received at 61208 <at> debbugs.gnu.org (full text, mbox):
> Cc: yingchao.yang <at> seaboxdata.com, 61208 <at> debbugs.gnu.org, theo <at> thornhill.no,
> yang.yingchao <at> qq.com
> Date: Thu, 02 Feb 2023 09:41:23 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > treesit-defun-tactic being 'nested isn't the problem here, at least not the direct cause. c-ts-mode doesn't consider switch cases or if-else statements as defuns; it only considers functions, structs, enums, and unions as defuns. So in a preprocessed C source file, C-M-a would move point to the beginning of the function, LINE_E. It does not in this particular file because tree-sitter is thrown off by the SWITCH() and CASE() macros: it can't tell what they are and parses them as function definitions.
> >
> > I don't object to setting treesit-defun-tactic to 'top-level in c-ts-mode, though. It can hide problems like this. Just be aware that it merely hides the problem.
>
> OK, I think I will make that change soon.
Done.
Forcibly Merged 61208 61209. Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 10 Sep 2023 17:22:01 GMT)
This bug report was last modified 1 year and 283 days ago.