GNU bug report logs - #78703
beginning-of-defun and friends still wrong in typescript-ts-mode


Package: emacs;

Reported by: Daniel Colascione <dancol <at> dancol.org>

Date: Thu, 5 Jun 2025 23:41:02 UTC

Severity: normal



Message #41 received at 78703 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Colascione <dancol <at> dancol.org>
Cc: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
Subject: Re: bug#78703: beginning-of-defun and friends still wrong in
 typescript-ts-mode
Date: Tue, 10 Jun 2025 19:19:03 +0300
> Date: Tue, 10 Jun 2025 08:49:48 -0700
> From: Daniel Colascione <dancol <at> dancol.org>
> CC: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
> 
> >> If I'm using, say, c++-ts-mode, my navigation commands should do the
> >> same thing they do in c++-mode.  Plenty of code as well as muscle
> >> memories rely on this behavior.
> >
> >I'm not sure I agree.  That c++-mode behaved like that doesn't mean
> >it's the last word, or that nothing can be improved in that behavior.
> 
> It's not just c++-mode. It's how most modes have behaved.
> 
> >In addition, TS-based modes make certain behaviors very hard (at least
> >if we base them on the parser information), 
> 
> Very hard how? You can use tree sitter like a more powerful parse partial sexp. It provides strictly more information.

Not really, not when you look closely.  The tools we've built before
tree-sitter are ad-hoc, so they allow us to provide information that
parsers don't have and don't need to have.  Our syntax tables are not
exactly "less efficient parsing", and regular expressions allow us to
match whatever we want and call that anything we want.

Take the recognition of DEFUNs by CC Mode as an example.  Tree-sitter
knows nothing about them.

So "strictly more information" is perhaps the expectation, but it
breaks down on closer inspection.

> There is no behavior whatsoever that's harder to implement because tree sitter is giving you more information. 

I invite you to look at c-ts-mode sources.  You will see plenty of
what was "harder to implement".  We still lack some useful
functionalities that are present in CC Mode, for that very reason;
what was easy to implement was done long ago.

> >> The "tactic" concept is an unnecessary layer of indirection.
> >
> >From where I stand, it's a new feature that was unavailable in non-TS
> >implementation.
> 
> It's not a feature. It's a UI and programming annoyance. How are you supposed to write code against functions with behavior that shifts on a whim with no stable functions to call instead?

We've been doing that since day one: you write code that looks at the
variables to figure out what behavior to expect, or you write code
that is general enough to not care.

> What am I supposed to do, let-bind every possible value around every function call?

Sometimes, yes.  Although hopefully not so frequently and not "every
possible value".

> Why have functions then. Let's just have one function with a strategy.

Arguments "ad absurdum" are not always useful.  In this case, no one
is calling for such an extreme.  But sometimes this has to be done.

> >But if the semantics of a command is ambiguous, then a switch makes
> >perfect sense.  In this case, what exactly "beginning of defun" means
> >when there are nested defuns is ambiguous.
> 
> Yet we use the concept of command names to express different concepts elsewhere. And if the concept is ambiguously defined, provide a minimal knob to adjust that concept, not change the operation of primitives to be inconsistent with each other. 

I think we do the former, or at least we try.

> >> Why does there need to be a tree-sitter version of
> >> prog-fill-reindent-defun?
> >
> >Because the way to get the indentation information from tree-sitter is
> >significantly different from the ad-hoc ways we do that in
> >"traditional" modes.
> 
> No it isn't. If the defun navigation functions in TS modes had their traditional behavior, they'd continue to work for higher level constructs built on top of them like the prog-mode reindent and mark defun. TS modes broke a whole bunch of things that had worked fine for decades, and instead of fixing them, they just made even more abstractions to plug inconsistent tree sitter things in place of the broken things.

Indentation is a lot more than just navigation.  And I disagree with
your extreme interpretation of the current state of indentation and
navigation support in TS-based modes.

> >How do you implement anything like c-set-offset or indentation styles
> >based only on low-level syntactic analysis? 
> 
> By using TS to implement c-guess-basic-syntax and friends.

Did you look at the implementation of how c-set-offset encodes
indentation information?  Did you try to think how to get the same
information from tree-sitter?  If you did, and found the way, how
about implementing c-ts-set-offset?  I think it's sorely missed.

> cc-mode indentation styles are clear expressions of user intent. No reason at all TS modes couldn't respect this intent and merely implement it a different way. Want to know whether you're after a class? Inside a namespace? Where a declaration begins? You have an AST right there!

Sorry, this is a simplification.  A typical declaration breaks down into
smaller parts, and we have expectations and ideas about indentation of
each one of them.  But the tree-sitter classification of the AST
constituents does not necessarily make that easy, because you could
have the same syntactic symbol both inside a declaration and in other
places.  So having an AST does not always immediately tell you how to
indent correctly.

> > where will the rest of the
> >necessary information come from, and who and how will apply it?
> 
> From the AST. Where else?

See above.

> >I think the point in the above example was that the semantics of "sexp"
> >is ambiguous in any language that is not Lisp.  That was (and still
> >is) the hard part of figuring out how forward-sexp should behave in
> >TS-based modes.  (In non-TS modes the behavior is just arbitrary
> >nonsense, if you ask me.)
> 
> Yes, and because it's ambiguous we get annoyances like python-mode's default sexp movement. Now every mode is like that, and you can't turn it off half the time?

What else did you expect?  Some users like one style, others like the
other.  Are we supposed to say "my way or the highway"?  And that's
even before we consider that the disagreement cuts through the
developers themselves.

I find continuing this kind of argument not constructive, so I will
stop here.  Let me just say that I think you are looking at this stuff
from some semi-abstract, almost idealistic, aspect.  As if we didn't
have 40 years of development and user experience and expectations to
keep and uphold.

> >> >> Furthermore, consider the following `top-level` tactic example using
> >> >> `c++-ts-mode`.  Here, we have a C++ namespace (which is considered a
> >> >> defun for C++)
> >> 
> >> Namespaces aren't defuns and c++-ts-mode shouldn't be indenting their
> >> contents by a level either.
> >
> >But c++-mode does indent them.  Doesn't this contradict what you said
> >about following past practices?
> 
> In c++-mode, I can turn it off with a documented user knob.

You've changed the subject.  But by all means, let's add such a knob
to c++-ts-mode, sure.

> In c++-ts-mode, I have to write fragile hacks to monkeypatch mode internals. They're not the same thing. And no matter what indent style I choose in c++-mode, namespace isn't magically a defun.

But beginning-of-defun nevertheless takes me to the beginning of the
namespace in c++-mode.




This bug report was last modified 4 days ago.
