GNU bug report logs - #62717
29.0.60; c-ts-mode does not indent the first line in a function after RET


Package: emacs;

Reported by: Daniel Martín <mardani29 <at> yahoo.es>

Date: Fri, 7 Apr 2023 19:50:01 UTC

Severity: normal

Found in version 29.0.60



Message #14 received at 62717 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Daniel Martín <mardani29 <at> yahoo.es>
Cc: Theodor Thornhill <theo <at> thornhill.no>, Yuan Fu <casouri <at> gmail.com>,
 62717 <at> debbugs.gnu.org, João Távora <joaotavora <at> gmail.com>,
 Alan Mackenzie <acm <at> muc.de>
Subject: Re: bug#62717: 29.0.60; c-ts-mode does not indent the first line in a
 function after RET
Date: Sun, 9 Apr 2023 03:20:23 +0300

On 08/04/2023 21:37, Daniel Martín wrote:
> Dmitry Gutov<dmitry <at> gutov.dev>  writes:
> 
>> I've looked at what nvim-treesitter does for indentation, and at least
>> one of the steps looks like this:
>>
>> https://github.com/nvim-treesitter/nvim-treesitter/blob/584ccea56e2d37b31ba292da2b539e1a4bb411ca/lua/nvim-treesitter/indent.lua#L129-L134
>>
>> If the current line is empty, look at the end of the previous line and
>> compute based on the node there.
>>
>> I'm not sure how this meshes with the fact that tree-sitter inserts a
>> "virtual" closer node at the end of the previous line, but the
>> approach is worth examining.
>>
>> Daniel, you posted about testing nvim-treesitter with several
>> scenarios. Does it do the right thing with this one?
> Yes, it works well in this scenario.  Inserting a new line automatically
> adds indentation.

All right.

From reading the code, it looks like a semi-coincidence that this
example works fine. The algorithm is simply different: it looks for
indent/dedent nodes, and there is nothing similar to our logic in
treesit--indent-1. That can be both good and bad, but the grammar (and
tree-sitter itself) likely co-evolved with that approach, so it's no
surprise the sharp edges match.

In particular, the virtual closer node seems to be skipped because the 
search uses descendant_for_range, which seems to jump over zero-length 
nodes.
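
As a rough illustration of the "empty line, look at the end of the
previous line" step quoted above, here is a minimal Emacs Lisp sketch.
The name my/indent-anchor-position is made up for this example; it is
not taken from nvim-treesitter or from treesit.el, and it only computes
the buffer position one might then hand to treesit-node-at.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/indent-anchor-position ()
  "Return the buffer position to inspect when indenting the current line.
On a non-blank line this is the first non-whitespace character.
On a blank line it is the position just after the last
non-whitespace character before point's line."
  (save-excursion
    (back-to-indentation)
    (if (not (eolp))
        ;; Non-blank line: use its first non-whitespace character.
        (point)
      ;; Blank line: walk back over whitespace and newlines so that
      ;; we end up at the end of the previous non-blank line.
      (skip-chars-backward " \t\n")
      (point))))
```

Whether the node found at that position is the useful one still depends
on how the zero-length closer is treated, as described above.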

We could try to hammer in that exception as a workaround, but the 
resulting PARENT won't contain BOL anyway, and it's not 100% clear how 
these fake nodes will look for other grammars. Indentation in 
ruby-ts-mode, for example, won't magically start working right away in 
the comparable situation (method definition without closer) because 
there is also a missing body_statement node, requiring further changes 
to indentation rules.

What does this mean for us? Short of reimplementing nvim-treesitter's 
algorithm (and I haven't read Atom's or Zed's indentation code; 
anybody's welcome to chime in with a summary of either), we could just 
install the patch at the end of this message: it fixes this particular
case in a somewhat hackish way, but at least it doesn't affect other
languages.

Note that it still doesn't fix very similar cases, e.g.

  int main () {
    for (;;) {<RET>

(we need additional rules looking for ERROR nodes, like in nvim's
indent.scm), but it does fix

  int main () {
    for (;;) {}<RET>

and

  int main () {
    int foo;<RET>

I'm not sure, though, what the big deal is with adding the top-level
function's closing curly first thing, before writing the body (after that
the parser starts working much better), so as far as I'm concerned this
patch is very optional. It does add some complexity, after all.
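
For anyone unfamiliar with the mechanism the patch relies on, here is a
small standalone sketch of add-function with :filter-args. The names
my/indent-fn and my/substitute-parent are hypothetical stand-ins (for
treesit-indent-function and the filter in the patch, respectively), and
the symbol fallback-parent stands in for a real parent node.

```elisp
;; Hypothetical stand-in for a variable that holds an indent function.
;; Here it simply returns its three arguments as a list.
(defvar my/indent-fn
  (lambda (node parent bol) (list node parent bol))
  "Function called with NODE, PARENT and BOL.")

;; A :filter-args function receives the argument list as ONE list and
;; must return the (possibly modified) list of arguments.
(defun my/substitute-parent (args)
  "Replace a nil PARENT in ARGS with the symbol `fallback-parent'."
  (pcase-let ((`(,node ,parent ,bol) args))
    (list node (or parent 'fallback-parent) bol)))

;; With a bare symbol as PLACE, add-function changes the variable's
;; default value; (local 'SYMBOL) would restrict it to one buffer.
(add-function :filter-args my/indent-fn #'my/substitute-parent)

;; The original function now sees the filtered arguments.
(funcall my/indent-fn nil nil 1)
```

The original function is left untouched; only the arguments it receives
are rewritten, which is why this kind of workaround stays fairly
contained.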

Adding Alan and Joao, who were interested in this scenario as well.

diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index 981c7766375..9aaa8b32c73 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -859,6 +859,18 @@ c-ts-mode--defun-skipper
     (goto-char (match-end 0)))
   (treesit-default-defun-skipper))

+(defun c-ts-base--before-indent (args)
+  (pcase-let ((`(,node ,parent ,bol) args))
+    (when (null node)
+      (let ((smallest-node (treesit-node-at (point))))
+        ;; "Virtual" closer curly added by the
+        ;; parser's error recovery.
+        (when (and (equal (treesit-node-type smallest-node) "}")
+                   (equal (treesit-node-end smallest-node)
+                          (treesit-node-start smallest-node)))
+          (setq parent (treesit-node-parent smallest-node)))))
+    (list node parent bol)))
+
 (defun c-ts-mode-indent-defun ()
   "Indent the current top-level declaration syntactically.

@@ -904,6 +916,8 @@ c-ts-base-mode
   ;; function_definitions, so we need to find the top-level node.
   (setq-local treesit-defun-prefer-top-level t)

+  (add-function :filter-args treesit-indent-function #'c-ts-base--before-indent)
+
   ;; Indent.
   (when (eq c-ts-mode-indent-style 'linux)
     (setq-local indent-tabs-mode t))





This bug report was last modified 2 years and 66 days ago.
