Package: emacs
Reported by: Daniel Colascione <dancol <at> dancol.org>
Date: Thu, 5 Jun 2025 23:41:02 UTC
Severity: normal
From: Daniel Colascione <dancol <at> dancol.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
Subject: bug#78703: beginning-of-defun and friends still wrong in typescript-ts-mode
Date: Tue, 10 Jun 2025 09:55:59 -0700

On June 10, 2025 9:19:03 AM PDT, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> Date: Tue, 10 Jun 2025 08:49:48 -0700
>> From: Daniel Colascione <dancol <at> dancol.org>
>> CC: casouri <at> gmail.com, brownts <at> troybrown.dev, 78703 <at> debbugs.gnu.org
>>
>> >> If I'm using, say, c++-ts-mode, my navigation commands should do the
>> >> same thing they do in c++-mode. Plenty of code as well as muscle
>> >> memories rely on this behavior.
>> >
>> >I'm not sure I agree. That c++-mode behaved like that doesn't mean
>> >it's the last word, or that nothing can be improved in that behavior.
>>
>> It's not just c++-mode. It's how most modes have behaved.
>>
>> >In addition, TS-based modes make certain behaviors very hard (at least
>> >not if we base it on the parser information),
>>
>> Very hard how? You can use tree sitter like a more powerful parse-partial-sexp. It provides strictly more information.
>
>Not really, not when you look closely. The tools we've built before
>tree-sitter are ad-hoc, so they allow us to provide information that
>parsers don't have and don't need to have. Our syntax tables are not
>exactly "less efficient parsing", and regular expressions allow us to
>match whatever we want and call that anything we want.

There is not one bit of information available to c-mode that is not available to c-ts-mode.

>Take the DEFUN recognition by CC mode as an example. Tree-sitter
>knows nothing about them.
>
>So "strictly more information" is perhaps an expectation, but it
>breaks at closer looking.
>
>> There is no behavior whatsoever that's harder to implement because tree sitter is giving you more information.
>
>I invite you to look at c-ts-mode sources. You will see plenty of
>what was "harder to implement".

That's a choice.

>We still lack some useful
>functionalities that are present in CC Mode, for that very reason;
>what was easy to implement was done long ago.

That's the opposite of reality.
Alan and others have spent years building a flexible and fast backtracking syntactic analyser for CC mode. Tree sitter does the same thing but in a more general way, in native code for better performance. Its availability makes doing what cc-mode does easier, not harder.

>> >> The "tactic" concept is an unnecessary layer of indirection.
>> >
>> >From where I stand, it's a new feature that was unavailable in non-TS
>> >implementation.
>>
>> It's not a feature. It's a UI and programming annoyance. How are you supposed to write code against functions with behavior that shifts on a whim with no stable functions to call instead?
>
>We've been doing that since day one: you write code that looks at the
>variables to figure out what behavior to expect, or you write code
>that is general enough to not care.

I can't wait to program against our glorious new DWIM function. The problem with let-binding the world is that the set of dynamic inputs becomes unbounded. What if I just bind the strategy option and one day TS introduces, say, a new sub-strategy option that makes the function I call behave differently? To the extent possible, and in the strategy case for TS mode it's certainly possible, commands should do one thing, and if you want to do a different thing, you run a different command.

>> What am I supposed to do, let-bind every possible value around every function call?
>
>Sometimes, yes. Although hopefully not so frequently and not "every
>possible value".
>
>> Why have functions then? Let's just have one function with a strategy.
>
>Arguments "ad absurdum" are not always useful. In this case, no one
>is calling for such an extremity. But sometimes this has to be done.
>
>> >But if the semantics of a command is ambiguous, then a switch makes
>> >perfect sense. In this case, what exactly "beginning of defun" means
>> >when there are nested defuns is ambiguous.
>>
>> Yet we use the concept of command names to express different concepts elsewhere.
>> And if the concept is ambiguously defined, provide a minimal knob to adjust that concept, not change the operation of primitives to be inconsistent with each other.
>
>I think we do the former, or at least we try.

Then let's make separate commands to express moving to a defun boundary one way versus another way and let users express their preference for connecting input to action using keymaps.

>> >> Why does there need to be a tree-sitter version of
>> >> prog-fill-reindent-defun?
>> >
>> >Because the way to get the indentation information from tree-sitter is
>> >significantly different from the ad-hoc ways we do that in
>> >"traditional" modes.
>>
>> No it isn't. If the defun navigation functions in TS modes had their traditional behavior, they'd continue to work for higher level constructs built on top of them like the prog-mode reindent and mark defun. TS modes broke a whole bunch of things that had worked fine for decades, and instead of fixing them, they just made even more abstractions to plug inconsistent tree sitter things in place of the broken things.
>
>Indentation is a lot more than just navigation. And I disagree with
>your extreme interpretation of the current state of indentation and
>navigation support in TS-based modes.

I'm right.

>> >How do you implement anything like c-set-offset or indentation styles
>> >based only on low-level syntactic analysis?
>>
>> By using TS to implement c-guess-basic-syntax and friends.
>
>Did you look at the implementation of how c-set-offset encodes
>indentation information? Did you try to think how to get the same
>information from tree-sitter? If you did, and found the way, how
>about implementing c-ts-set-offset? I think it's sorely missed.
>
>> cc-mode indentation styles are clear expressions of user intent. No reason at all TS modes couldn't respect this intent and merely implement it a different way. Want to know whether you're after a class? Inside a namespace? Where a declaration begins?
>> You have an AST right there!
>
>Sorry, this is a simplification. A typical declaration breaks down into
>smaller parts, and we have expectations and ideas about indentation of
>each one of them. But the tree-sitter classification of the AST
>constituents does not necessarily make that easy, because you could
>have the same syntactic symbol both inside a declaration and in other
>places. So having an AST does not always immediately tell you how to
>indent correctly.

No, but it gives you more information than looking-at does, and cc-mode does its job admirably given only that simple tool. You can look at nesting and context in the TS AST to figure out what to do.

>> > where will the rest of the
>> >necessary information come from, and who and how will apply it?
>>
>> From the AST. Where else?
>
>See above.
>
>> >I think the point in the above example was that the semantic of "sexp"
>> >is ambiguous in any language that is not Lisp. That was (and still
>> >is) the hard part of figuring out how forward-sexp should behave in
>> >TS-based modes. (In non-TS modes the behavior is just arbitrary
>> >nonsense, if you ask me.)
>>
>> Yes, and because it's ambiguous we get annoyances like python-mode's default sexp movement. Now every mode is like that, and you can't turn it off half the time?
>
>What else did you expect? Some users like one style, others like the
>other. Are we supposed to say "my way or the highway"? And that's
>even before we consider that the disagreement cuts through the
>developers themselves.

No. I'm expecting a generally consistent experience, and if we want to provide a configuration knob, it should affect everything consistently. One shouldn't have to form independent and different muscle memory for each language mode because the whims of their authors were different.

>I find continuing this kind of argument not constructive, so I will
>stop here.
>Let me just say that I think you are looking at this stuff
>from some semi-abstract, almost idealistic, aspect. As if we didn't
>have 40 years of development and user experience and expectations to
>keep and uphold.

I'll never understand the mindset that holds that things making sense and having a structure is bad, actually, because sense and structure are "academic" and "idealistic". What 40 years of development and user experience holds is that if I'm four pages deep into a nasty TypeScript function and hit beginning-of-defun, I want to go to the beginning of the four-page defun I am editing and not some random place two pages up that I didn't even know about in which someone scribbled out some kind of nested lambda irrelevant to my present task.

>> >> >> Furthermore, consider the following `top-level` tactic example using
>> >> >> `c++-ts-mode`. Here, we have a C++ namespace (which is considered a
>> >> >> defun for C++)
>> >>
>> >> Namespaces aren't defuns and c++-ts-mode shouldn't be indenting their
>> >> contents by a level either.
>> >
>> >But c++-mode does indent them. Doesn't this contradict what you said
>> >about following past practices?
>>
>> In c++-mode, I can turn it off with a documented user knob.
>
>You've changed the subject. But by all means, let's add such a knob
>to c++-ts-mode, sure.
>
>> In c++-ts-mode, I have to write fragile hacks to monkeypatch mode internals. They're not the same thing. And no matter what indent style I choose in c++-mode, namespace isn't magically a defun.
>
>But beginning-of-defun nevertheless takes me to the beginning of the
>namespace in c++-mode.