GNU bug report logs - #78703
beginning-of-defun and friends still wrong in typescript-ts-mode


Package: emacs;

Reported by: Daniel Colascione <dancol <at> dancol.org>

Date: Thu, 5 Jun 2025 23:41:02 UTC

Severity: normal


From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Colascione <dancol <at> dancol.org>
Cc: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
Subject: bug#78703: beginning-of-defun and friends still wrong in typescript-ts-mode
Date: Tue, 10 Jun 2025 15:12:22 +0300
> From: Daniel Colascione <dancol <at> dancol.org>
> Cc: Troy Brown <brownts <at> troybrown.dev>,  Eli Zaretskii <eliz <at> gnu.org>,
>   78703 <at> debbugs.gnu.org
> Date: Tue, 10 Jun 2025 00:24:54 -0700
> 
> If I'm using, say, c++-ts-mode, my navigation commands should do the
> same thing they do in c++-mode.  Plenty of code as well as muscle
> memories rely on this behavior.

I'm not sure I agree.  That c++-mode behaved like that doesn't mean
it's the last word, or that nothing can be improved in that behavior.

In addition, TS-based modes make certain behaviors very hard (at least
if we base them on the parser information), and OTOH make certain
behaviors very easy that were hard with the "traditional" modes.  So
we should keep an open mind about these aspects, and not automatically
demand 110% compatibility with past behavior.

> The "tactic" concept is an unnecessary layer of indirection.

From where I stand, it's a new feature that was unavailable in the
non-TS implementations.

> If operation A takes you to place X and operation B takes you to
> different place Y, the way to express the difference between operations
> A and B is to make them _different commands_, not by twiddling some
> global switch.

But if the semantics of a command is ambiguous, then a switch makes
perfect sense.  In this case, what exactly "beginning of defun" means
when there are nested defuns is ambiguous.

> When you change tree-sitter "strategies" right now, you're silently
> turning one command into another command; that's confusing for everyone.
> Please give these "strategies" individual command names.  We have
> beginning-of-defun and beginning-of-defun-comments, not a knob that
> alters beginning-of-defun.

I don't want to memorize two commands when one will do.  That's why we
have the various optional behaviors of commands and DWIM-ish
variations in their behavior.

> > prog-fill-reindent-defun uses beginning-of-defun because we
> > don’t have better choices before tree-sitter. In tree-sitter major
> > modes, what we’ve been doing is to make the existing commands
> > customizable so tree-sitter can provide a tree-sitter version of
> > it.
> 
> Why does there need to be a tree-sitter version of
> prog-fill-reindent-defun?

Because the way to get the indentation information from tree-sitter is
significantly different from the ad-hoc ways we do that in
"traditional" modes.

> Isn't it enough that tree-sitter provide the
> low-level syntactic analysis for prog-fill-reindent-defun to do its job?
> Why the high level hook?

How do you implement anything like c-set-offset or indentation styles
based only on low-level syntactic analysis?  Where will the rest of the
necessary information come from, and who will apply it, and how?

> > We’ve done this for forward-sexp: we added
> > forward-sexp-function. Some commands already have customization points
> > long ago, like beginning-of-defun, which has
> > beginning-of-defun-function.
> 
> Modes generally use these "customization points" to _implement_ the
> familiar behavior, not to give them random different
> user-visible semantics.

I think the point in the above example was that the semantics of "sexp"
is ambiguous in any language that is not Lisp.  That was (and still
is) the hard part of figuring out how forward-sexp should behave in
TS-based modes.  (In non-TS modes the behavior is just arbitrary
nonsense, if you ask me.)

> > So I added prog-fill-reindent-defun-function and a tree-sitter version
> > treesit-fill-reindent-defun. The tree-sitter implementation uses
> > treesit-defun-at-point, so it doesn’t even need to concern
> > itself with tactics.
> >
> > Now in tree-sitter major modes, prog-fill-reindent-defun should always
> > indent the enclosing defun.
> 
> Which now means prog-fill-reindent-defun can indent something other than
> what mark-defun highlights?

They have done subtly different things for a long time.  It's clearly
visible in the code.

> Tree sitter's job is syntactic analysis, not UI differentiation.

The way we use syntactic information in these commands is a leaky
abstraction: the syntax aspects leak into the UI.  So it is small
wonder that tree-sitter affects the UI in some (relatively minor)
ways.

> The default should be to match behavior that's been stable for decades.

As I tried to explain above, I don't necessarily agree.

> Use of tree sitter should be an implementation detail for users.

Since the introduction of tree-sitter based capabilities into Emacs,
we've learned that this simply doesn't work, not in Emacs.  Syntax and
semantics leak into our UI, and tree-sitter deals with syntactic and
semantic information that is sometimes very different from what, e.g.,
syntax-ppss and friends let us use.

So I do understand where you are coming from, but experience taught
us that it cannot work that way in Emacs.  If we were designing Emacs
from scratch today, perhaps we could have done that in a way that
would avoid these leaks, but we are not there.

> If we want to provide UI to better handle nested defuns, this UI should
> go in prog-mode.el and rely on mode-provided syntactic analysis, not
> just delegate to a mode function that does different random stuff in
> each mode.

That'd be a massive rewrite of gobs of existing code, I'm afraid.  I
invite you to take a look at the existing code and see how it mixes
syntax with UI.  That's even visible at the level of the command
names: "sexp" only makes sense in Lisp, and the notion of "balanced
parens" has no place in languages without brackets and braces.

> >> Furthermore, consider the following `top-level` tactic example using
> >> `c++-ts-mode`.  Here, we have a C++ namespace (which is considered a
> >> defun for C++)
> 
> Namespaces aren't defuns and c++-ts-mode shouldn't be indenting their
> contents by a level either.

But c++-mode does indent them.  Doesn't this contradict what you said
about following past practices?




This bug report was last modified 57 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.