Package: emacs;
Reported by: Daniel Colascione <dancol <at> dancol.org>
Date: Thu, 5 Jun 2025 23:41:02 UTC
Severity: normal
Message #47 received at 78703 <at> debbugs.gnu.org (full text, mbox):
From: Yuan Fu <casouri <at> gmail.com> To: Daniel Colascione <dancol <at> dancol.org> Cc: Eli Zaretskii <eliz <at> gnu.org>, Troy Brown <brownts <at> troybrown.dev>, 78703 <at> debbugs.gnu.org Subject: Re: bug#78703: beginning-of-defun and friends still wrong in typescript-ts-mode Date: Tue, 10 Jun 2025 12:11:34 -0700
I’m hesitant to get into another long debate, but let me give my two cents. >> Consider that existing code in Emacs isn’t set in stone, it’s not >> really that fundamental that beginning-of-defun must work in a way >> that makes prog-fill-reindent-defun behave >> desirably. > > Experimenting with a different UI paradigm doesn't justify introducing > an inconsistency to behavior that's worked well for literally decades. > If I'm using, say, c++-ts-mode, my navigation commands should do the > same thing they do in c++-mode. Plenty of code as well as muscle > memories rely on this behavior. > > The "tactic" concept is an unnecessary layer of indirection. > If operation A takes you to place X and operation B takes you to > different place Y, the way to express the difference between operations > A and B is to make them _different commands_, not by twiddling some > global switch. c-defun-tactic exists since Emacs 24, treesit-defun-tactic is actually modeled after it. Treesit-defun-tactic’s `top-level` and `nested` corresponds to c-defun-tactic’s `t` and `go-outward`. The only difference is that tree-sitter defaults to the `nested` option. So we’re not breaking the UI paradigm as you imagined. Also c++-ts-mode is a new separate mode, and provides vastly different features to c++-mode, so some difference is warranted. A lot of features in c++-mode is missing simply because we don’t have the resource to recreate them in c++-ts-mode. After all, Alan and others spend decades developing c++-mode. We also can’t just build c++-ts-mode on top of c++-mode because the engine in c++ mode is very heavy (meaning slow), if we use both tree-sitter parser (which has its own overhead) and the cc engine, it’ll be even slower than c++-mode for not much benefit. It just makes no sense. Also, cc-mode is a huge codebase, and I don’t have expertise in it. If Alan or some other cc-mode expert decides to incorporate tree-sitter into cc-mode, they might get strictly more information as you claimed. But I and other people willing to work on c++-ts-mode don’t have that ability. (Also Theo is the maintainer of c++-ts-mode, I’m just a co-maintainer trying to fix things up when he’s not available.) > When you change tree-sitter "strategies" right now, you're silently > turning one command into another command, that's confusing for everyone. > Please give these "strategies" individual command names. We have > beginning-of-defun and beginning-of-defun-comments, not a knob that > alters beginning-of-defun. Using variables to alter a commands behavior is a very well accepted practice, it’s everywhere in Emacs. Both approach are equally valid and are suitable for difference circumstances. In our case, there’s already the long-existed tactic concept so we want to keep to it, and also people usually only want the command behave one way or another, so they can just set the variable and be done with it. Multiple commands are more suitable for cases where the user wants to use different behaviors in the same time, then they can bind each command to different keys. Multiple commands have their own downsides like more hassle to configure and harder to discover, so they’re not strictly better. >> prog-fill-reindent-defun uses beginning-of-defun because we >> don’t have better choices before tree-sitter. In tree-sitter major >> modes, what we’ve been doing is to make the existing commands >> customizable so tree-sitter can provide a tree-sitter version of >> it. > > Why does there need to be a tree-sitter version of > prog-fill-reindent-defun? Isn't it enough that tree-sitter provide the > low-level syntactic analysis for prog-fill-reindent-defun to do its job? > Why the high level hook? If we want to use a lower-level abstraction, we’d need to create some framework to get the defun at point, and add a tree-sitter provider of it. That just brings a whole can of worms: what should this API look like exactly? Should we change all other existing functions (many of which are hundred line monsters that are hard to understand and refactor without introducing breakage) to use this new framework? Should it limit to only defun, or should it support other constructs? If we want to support other constructs, how do we implement the provider for those constructs in non-tree-sitter settings? Many other functions that use beginning/end-of-defun needs more than getting the defun at point, they might want to know if point is at the beginning of a defun, they might want to know if there’s another defun after this one. How do we design the framework so these requests are met? For the reasons above, thing-at-point isn’t really suitable for our purpose, or at least we need to make significant changes to its API to make it suitable. Thing-at-point is mostly a simple API to get things like url and symbol and word at point, not something that can give us rich information of syntactic construct at point. You might think prog-fill-reindent-defun can just use tree-sitter functions to get syntax information, but that’s a big no-no. Existing Emacs features like imenu, font-lock, prog-fill-reindent-defun, etc, shouldn’t even know about tree-sitter, they are user-facing and only provide customization points for major modes/tree-sitter/eglot/etc to plugin and provide functionality. So things aren’t that simple if you start thinking about how to actually doing it. If you can design something that give the right abstraction that major mode/tree-sitter/eglot/etc can plugin to, and can make it work for both non-tree-sitter and tree-sitter, and potential future tools that provide parse trees, I’ll be very happy, because we do want that. So given all that consideration, I decided to go with a higher-level abstraction you saw. After my refactor of prog-fill-reindent-defun, reimplementing it for tree-sitter is just 8 lines of code. The small code duplication is perfectly acceptable IMO. A higher-level abstraction also has plenty benefits: major modes can define their own version if it’s necessary, we have more freedom in our implementation because there’s a looser coupling, etc. > >> We’ve done this for forward-sexp: we added >> forward-sexp-function. Some commands already have customization points >> long ago, like beginning-of-defun, which has >> beginning-of-defun-function. > > Modes use generally these "customization points" to _implement_ the > familiar behavior, not to give them random different > user-visible semantics. Let me also reply to your later messages here: > It's not a feature. It's a UI and programming annoyance. How are you supposed to write code against functions with behavior that shifts on a whim with no stable functions to call instead? > > Yet we use the concept of command names to express different concepts elsewhere. And if the concept is ambiguously defined, provide a minimal knob to adjust that concept, not change the operation of primitives to be inconsistent with each other. I agree that beginning/end-of-defun has evolved into a kind of primitive function for getting the defun at point, because we didn’t have a parse tree or an API for getting defun at point. But I disagree your conclusion that one can’t write code against beginning/end-of-defun that can change behavior. I mean, that’s the point of abstraction, no? User can choose they want top-level or nested kind of defun, and by changing that, all the functions that use beginning/end-of-defun automatically switch to the chosen behavior. Isn’t that better than having no choice? But of course there'll be cases where the abstraction doesn’t work, like prog-fill-reindent-defun. It’s mostly because beginning/end-of-defun is a god awful abstraction, but that’s what all the existing functions use and we have to live with it, until someone creates a better abstraction and refactor all the existing functions to use it. Anyway, in cases where the abstraction doesn’t work well, we just need to do a bit more to make it work, and I added prog-fill-reindent-defun-function. >> So I added prog-fill-reindent-defun-function and a tree-sitter version >> treesit-fill-reindent-defun. The tree-sitter implementation uses >> treesit-defun-at-point, so it doesn’t even need to concern >> with tactics. >> >> Now in tree-sitter major modes, prog-fill-reindent-defun should always >> indent the enclosing defun. > > Which now means prog-fill-reindent-defun can indent something other than > what mark-defun highlights? That seems odd to me. Tree sitter's job is > syntactic analysis, not UI differentiation. > >>> After that, I'm less concerned about its behavior, but think the >>> current `nested` behavior of visiting the previous sibling defun until >>> there are no more, then visiting the parent defun makes sense. The >>> key difference is that you only visit the previous sibling if point is >>> at a defun boundary...otherwise visit the defun containing point. >> >> Hmmm, it doesn’t feel very convenient, you’d need to first adjust your >> point to be precisely at the boundary, then press C-M-a/e? IMO that >> adds too much overhead. I added a tactic `parent-first` that always >> move to the beginning/end of the enclosing defun. People that prefers >> this kind of defun movement can use this tactic. > > The default should be to match behavior that's been stable for decades. > Use of tree sitter should be an implementation detail for users. > > If we want to provide UI to better handle nested defuns, this UI should > go in prog-mode.el and rely on mode-provided syntactic analysis, As I described above, it’s not as simple as you imagined. > not > just delegate to a mode function that does different random stuff in > each mode. If the mode use tree-sitter provided default (which they 99.99% will), the behavior is well defined. Even if major mode decide to provide their own version, I trust the major mode author to know what they’re doing and have a good reason to implement their own, and provide a function that fits the docstring of prog-fill-reindent-defun. So no, it won’t be random stuff. >>> Furthermore, consider the following `top-level` tactic example using >>> `c++-ts-mode`. Here, we have a C++ namespace (which is considered a >>> defun for C++) > > Namespaces aren't defuns and c++-ts-mode shouldn't be indenting their > contents by a level either. c++-ts-mode is unusable without hacks like > pragmatically editing the indentation rules in user configuration. If it’s a bug in the indentation rule, we can fix it. But can you open another bug report and describe what exactly is wrong? Yuan
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.