GNU bug report logs - #78703
beginning-of-defun and friends still wrong in typescript-ts-mode


Package: emacs;

Reported by: Daniel Colascione <dancol <at> dancol.org>

Date: Thu, 5 Jun 2025 23:41:02 UTC

Severity: normal


From: Daniel Colascione <dancol <at> dancol.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
Subject: bug#78703: beginning-of-defun and friends still wrong in typescript-ts-mode
Date: Tue, 10 Jun 2025 08:49:48 -0700

On June 10, 2025 5:12:22 AM PDT, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> From: Daniel Colascione <dancol <at> dancol.org>
>> Cc: Troy Brown <brownts <at> troybrown.dev>,  Eli Zaretskii <eliz <at> gnu.org>,
>>   78703 <at> debbugs.gnu.org
>> Date: Tue, 10 Jun 2025 00:24:54 -0700
>> 
>> If I'm using, say, c++-ts-mode, my navigation commands should do the
>> same thing they do in c++-mode.  Plenty of code as well as muscle
>> memories rely on this behavior.
>
>I'm not sure I agree.  That c++-mode behaved like that doesn't mean
>it's the last word, or that nothing can be improved in that behavior.

It's not just c++-mode. It's how most modes have behaved.

>In addition, TS-based modes make certain behaviors very hard (at least
>not if we base it on the parser information), 

Very hard how? You can use tree-sitter like a more powerful parse-partial-sexp. It provides strictly more information. No behavior whatsoever becomes harder to implement because tree-sitter is giving you more information.

> and OTOH make certain
>behaviors very easy that were hard with the "traditional" modes.  So
>we should keep an open mind about these aspects, and not automatically
>demand 110% compatibility to past behavior.
>
>> The "tactic" concept is an unnecessary layer of indirection.
>
>From where I stand, it's a new feature that was unavailable in non-TS
>implementation.

It's not a feature. It's a UI and programming annoyance. How are you supposed to write code against functions with behavior that shifts on a whim with no stable functions to call instead?

What am I supposed to do, let-bind every possible value around every function call? Why have functions, then? Let's just have one function with a strategy:

(let ((dwim-strategy 'call-process)) (dwim "date -R"))

(let ((dwim-strategy 'switch-to-buffer)) (dwim "*scratch*"))

We already have a knob for users to express the concept of what happens when a key is pressed: the command binding mechanism.

org-mode is annoying in this way too. I'm in org mode. I want to see what, say, C-tab does. I type C-h c C-tab and I get something like "org-dwim-control-tab".

Yeah, that's useful.

Why bother having keymaps at all? Org made an inner platform for key binding. Inner platforms are bad and hurt generality.

>> If operation A takes you to place X and operation B takes you to
>> different place Y, the way to express the difference between operations
>> A and B is to make them _different commands_, not by twiddling some
>> global switch.
>
>But if the semantics of a command is ambiguous, then a switch makes
>perfect sense.  In this case, what exactly "beginning of defun" means
>when there are nested defuns is ambiguous.

Yet we use command names to express different concepts elsewhere. And if the concept is ambiguously defined, provide a minimal knob to adjust that concept, rather than changing the operation of primitives so that they become inconsistent with each other.
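
A minimal sketch of what distinct command names could look like, assuming Emacs 29 or later, where `treesit-defun-tactic` takes the values `nested` and `top-level`. The `my/` names are hypothetical and not part of Emacs:

```elisp
(require 'treesit)

(defun my/beginning-of-defun-top-level (&optional arg)
  "Hypothetical command: go to the start of a top-level defun.
ARG is passed through to `treesit-beginning-of-defun'."
  (interactive "^p")
  ;; The let-binding lives inside the command, so callers never
  ;; have to touch the tactic variable themselves.
  (let ((treesit-defun-tactic 'top-level))
    (treesit-beginning-of-defun arg)))

(defun my/beginning-of-defun-nested (&optional arg)
  "Hypothetical command: go to the start of a defun, honoring nesting.
ARG is passed through to `treesit-beginning-of-defun'."
  (interactive "^p")
  (let ((treesit-defun-tactic 'nested))
    (treesit-beginning-of-defun arg)))
```

Each variant can then get its own key binding, and C-h c on that key names the exact behavior.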

>> When you change tree-sitter "strategies" right now, you're silently
>> turning one command into another command, that's confusing for everyone.
>> Please give these "strategies" individual command names.  We have
>> beginning-of-defun and beginning-of-defun-comments, not a knob that
>> alters beginning-of-defun.
>
>I don't want to memorize two commands when one will do.  That's why we
>have the various optional behaviors of commands and DWIM-ish
>variations in their behavior.

This isn't DW*I*M, and it's hard to imagine that the current default, going to the beginning of the lexically previous function, is what many people mean.

>> > prog-fill-reindent-defun uses beginning-of-defun because we
>> > don’t have better choices before tree-sitter. In tree-sitter major
>> > modes, what we’ve been doing is to make the existing commands
>> > customizable so tree-sitter can provide a tree-sitter version of
>> > it.
>> 
>> Why does there need to be a tree-sitter version of
>> prog-fill-reindent-defun?
>
>Because the way to get the indentation information from tree-sitter is
>significantly different from the ad-hoc ways we do that in
>"traditional" modes.

No, it isn't. If the defun navigation functions in TS modes had their traditional behavior, they'd continue to work for the higher-level constructs built on top of them, like prog-fill-reindent-defun and mark-defun. TS modes broke a whole bunch of things that had worked fine for decades, and instead of fixing them, they just added even more abstractions to plug inconsistent tree-sitter things in place of the broken things.

You apply this procedure repeatedly and you get a new editor, and going by the defaults I've seen from the TS modes, it's not a better editor. 

>> Isn't it enough that tree-sitter provide the
>> low-level syntactic analysis for prog-fill-reindent-defun to do its job?
>> Why the high level hook?
>
>How do you implement anything like c-set-offset or indentation styles
>based only on low-level syntactic analysis? 

By using TS to implement c-guess-basic-syntax and friends. cc-mode indentation styles are clear expressions of user intent. No reason at all TS modes couldn't respect this intent and merely implement it a different way. Want to know whether you're after a class? Inside a namespace? Where a declaration begins? You have an AST right there!
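
A minimal sketch of asking the AST that kind of question, assuming Emacs 29 or later with a tree-sitter parser active in the buffer. `my/enclosing-node` is a hypothetical helper, and the node type names in the usage comment are assumptions about the C++ grammar:

```elisp
(require 'treesit)

(defun my/enclosing-node (type-regexp &optional pos)
  "Return the nearest ancestor of the node at POS matching TYPE-REGEXP.
POS defaults to point.  Return nil if no ancestor matches.
Hypothetical helper, not part of Emacs."
  (let ((node (treesit-node-at (or pos (point)))))
    ;; Walk up the parent chain until a node type matches.
    (treesit-parent-until
     node
     (lambda (n)
       (string-match-p type-regexp (treesit-node-type n))))))

;; Usage, with node type names assumed from the C++ grammar:
;;   (my/enclosing-node "namespace_definition")  ; inside a namespace?
;;   (my/enclosing-node "class_specifier")       ; inside a class?
```

An indentation engine could map the answers onto existing style variables instead of introducing a separate set of rules.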

> where will the rest of the
>necessary information come from, and who and how will apply it?

From the AST. Where else?

>> > We’ve done this for forward-sexp: we added
>> > forward-sexp-function. Some commands already have customization points
>> > long ago, like beginning-of-defun, which has
>> > beginning-of-defun-function.
>> 
>> Modes use generally these "customization points" to _implement_ the
>> familiar behavior, not to give them random different
>> user-visible semantics.
>
>I think the point in the above example was that the semantic of "sexp"
>is ambiguous in any language that is not Lisp.  That was (and still
>is) the hard part of figuring out how forward-sexp should behave in
>TS-based modes.  (In non-TS modes the behavior is just arbitrary
>nonsense, if you ask me.)

Yes, and because it's ambiguous we get annoyances like python-mode's default sexp movement. Now every mode is like that, and you can't turn it off half the time?

The key is the *relationships* between the commands that help users form mental models of what their actions are going to do. For example, if blink-paren-mode highlights the other end of some balanced construct, forward-sexp or backward-sexp will take you there. Easy to learn and predict. Likewise, beginning-of-defun should move to the start of the region that mark-defun highlights, and indent-defun should indent the same part of the buffer that mark-defun highlights.

That's why it's just weird to have a TS hook specifically for indenting a defun: it just invites the kind of inconsistency that makes the system hard to reason about and annoying to work with.

>> Tree sitter's job is syntactic analysis, not UI differentiation.
>
>The way we use syntactic information in this commands is a leaky
>abstraction: the syntax aspects leak into the UI. 

It doesn't have to. There is nothing about the additional information TS provides that *forces* you to implement beginning-of-defun in a way that fails to respect program hierarchy. That was a choice, and the existence of this "strategy" system shows it.

> So it is a small
>wonder that tree-sitter affects the UI in some (relatively minor)
>ways.

No, it breaks the UI in unnecessary ways.

>> The default should be to match behavior that's been stable for decades.
>
>As I tried to explain above, I don't necessarily agree.
>
>> Use of tree sitter should be an implementation detail for users.
>
>Since the introduction of tree-sitter based capabilities into Emacs,
>we've learned that this simply doesn't work, not in Emacs.  

It would work fine if people cared about UI consistency. Subtle differences in semantic analysis are to be expected. Gross behavioral differences in long-stable and otherwise consistent commands are not.

That's like saying cars have inconsistent, varying UIs, so when you buy an electric car, you should *expect* the steering wheel to be in the right rear seat. Hey, the power train is a leaky abstraction!

> Syntax and
>semantics leak into our UI, and tree-sitter deals with syntactic and
>semantic information that is sometimes very different from what, e.g.,
>syntax-ppss and friends let us use.
>
>So I do understands where you are coming from, but experience taught
>us that it cannot work that way in Emacs.  If we were designing Emacs
>from scratch today, perhaps we could have done that in a way that
>would avoid these leaks, but we are not there.
>
>> If we want to provide UI to better handle nested defuns, this UI should
>> go in prog-mode.el and rely on mode-provided syntactic analysis, not
>> just delegate to a mode function that does different random stuff in
>> each mode.
>
>That'd be a massive rewrite of gobs of existing code, I'm afraid. 

Would it? How? You'd start with a baseline no different from today's and let the gradually increasing lexical and syntactic knowledge provided by modes (perhaps using TS as a backend) add capabilities. forward-class, for example, might just signal an error in modes that didn't provide a definition of that construct.
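
A minimal sketch of that idea, assuming Emacs 29 or later. Both `my/class-node-regexp` and `my/forward-class` are hypothetical names, and a tree-sitter backend is only one possible way for a mode to supply the definition:

```elisp
(require 'treesit)

(defvar-local my/class-node-regexp nil
  "Hypothetical mode-provided regexp matching class-like node types.
Nil means the current mode does not define such a construct.")

(defun my/forward-class ()
  "Hypothetical command: move to the end of the next class-like construct.
Signal a `user-error' if the mode provides no definition of a class."
  (interactive)
  (unless my/class-node-regexp
    (user-error "%s does not define a class construct" major-mode))
  ;; Search forward from the node at point for a matching node type;
  ;; on success, point moves to the end of the matched node.
  (or (treesit-search-forward-goto (treesit-node-at (point))
                                   my/class-node-regexp)
      (user-error "No class found after point")))
```

The generic command lives in one place, and a mode opts in by setting a single variable.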

Yes, languages have different ideas about what constitutes a function, but the idea of nesting constructs is common enough across languages that it ought to come with a roughly consistent way to navigate them.

> I
>invite you to take a look at the existing code and see how it mixes
>syntax with UI.  That's even visible at the level of the command
>names: "sexp" only makes sense in Lisp, and the notion of "balanced
>parens" has no place in languages without brackets and braces.
>
>> >> Furthermore, consider the following `top-level` tactic example using
>> >> `c++-ts-mode`.  Here, we have a C++ namespace (which is considered a
>> >> defun for C++)
>> 
>> Namespaces aren't defuns and c++-ts-mode shouldn't be indenting their
>> contents by a level either.
>
>But c++-mode does indent them.  Doesn't this contradict what you said
>about following past practices?

In c++-mode, I can turn it off with a documented user knob. In c++-ts-mode, I have to write fragile hacks to monkeypatch mode internals. They're not the same thing. And no matter what indent style I choose in c++-mode, namespace isn't magically a defun.




This bug report was last modified 4 days ago.


