GNU bug report logs - #60656
30.0.50; tree-sitter: editing a buffer invalidates visited node instances


Package: emacs;

Reported by: Mickey Petersen <mickey <at> masteringemacs.org>

Date: Sun, 8 Jan 2023 11:09:02 UTC

Severity: normal

Found in version 30.0.50



Message #8 received at 60656 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Mickey Petersen <mickey <at> masteringemacs.org>
Cc: 60656 <at> debbugs.gnu.org
Subject: Re: bug#60656: 30.0.50; tree-sitter: editing a buffer invalidates 
 visited node instances
Date: Sun, 8 Jan 2023 19:57:32 -0800
Mickey Petersen <mickey <at> masteringemacs.org> writes:

> If you parse some text, retrieve a node -- using `treesit-node-at',
> for example -- and then edit the buffer, then the node you retrieved
> is marked outdated.
>
> However, tree-sitter is capable of handling that, to a greater or lesser extent:
>
> https://tree-sitter.github.io/tree-sitter/using-parsers#editing
>
> It is therefore possible to refresh node instances that were created
> _before_ the edit. I suppose it could remain an explicit step that you
> must enter a special form and then Emacs will track node instances
> issued inside that form and refresh them when edits take place inside
> of it.
>
> As it stands, it is very hard to edit and maintain a node registry at
> the same time. (I'm using markers and overlays as a crude hack to work
> around it.)

This is kind of a limitation of tree-sitter. The "node editing" isn’t
what you think it is (it fooled me too when I first read it).
Tree-sitter’s incremental parsing works roughly like this:

1. You have a parsed tree, TREE, corresponding to some TEXT.
2. You make some edit to the TEXT, e.g., TEXT’ = insert(TEXT, 1, "abc").
3. Now you need to "edit" the old tree with the _positions_ of your edit:
   edit(TREE, Insert(pos=1, len=3)). (Notice that this modifies the tree in place.)
4. You reparse the edited tree and get a new tree:
   TREE’ = parse(TREE, TEXT’). (Notice that this returns a new tree.)
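
To make the four steps concrete, here is a toy model in Python. It is only a sketch: `Node`, `Tree`, `parse` and `edit_insert` are made-up names, not tree-sitter's or Emacs's API, the "parser" just splits on whitespace, and positions are 0-based offsets rather than Emacs buffer positions. What it does mirror is the shape of the workflow: the edit step changes positions in place, and the parse step hands back a different tree with different node objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str
    start: int  # 0-based offset, inclusive
    end: int    # 0-based offset, exclusive


@dataclass
class Tree:
    nodes: List[Node] = field(default_factory=list)


def parse(text: str, old_tree: Optional[Tree] = None) -> Tree:
    """Toy parser: one 'word' node per whitespace-separated token.

    Always builds fresh Node objects. OLD_TREE is accepted only to mirror
    the shape of parse(TREE, TEXT'); this toy does not reuse any of it.
    """
    nodes = []
    pos = 0
    for token in text.split():
        start = text.index(token, pos)
        end = start + len(token)
        nodes.append(Node("word", start, end))
        pos = end
    return Tree(nodes)


def edit_insert(tree: Tree, pos: int, length: int) -> None:
    """Step 3: shift positions in place to account for an insertion."""
    for node in tree.nodes:
        if node.start >= pos:
            node.start += length
        if node.end > pos:
            node.end += length


# Step 1: a parsed tree for some text.
text = "foo bar"
tree = parse(text)
node = tree.nodes[1]  # the node for "bar", at 4..7

# Step 2: edit the text.
new_text = text[:1] + "abc" + text[1:]  # "fabcoo bar"

# Step 3: tell the old tree where the edit happened (modifies it in place).
edit_insert(tree, pos=1, length=3)

# Step 4: reparse; this returns a new tree.
new_tree = parse(new_text, tree)
```

After step 3 the old `node` has correct positions again (7..10, which is where "bar" now sits), but the node at the same place in `new_tree` is a separate object that the old node knows nothing about.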

If you have a NODE from TREE, editing that node only updates its position
information. That corresponds to the edit(TREE, ...) step. There is no
equivalent of the parse(TREE, TEXT’) step for nodes: once the tree is
reparsed and a new tree is returned, none of the nodes in the old tree
get carried over to the new tree. In practice, tree-sitter reuses the old
tree’s data, but conceptually the old and new trees don’t share any nodes.

IOW, the editing feature for nodes is for a very specific situation,
where you have edited the parse tree but haven’t reparsed yet. In this
case, if you want your node’s positions to be correct, you edit the node.
But once you reparse, there is no way to "update" this old node
into its "equivalent" in the new tree.
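
For illustration only, the closest thing to an "update" is to track the position yourself and look the node up again after the reparse, which is roughly what the markers-and-overlays workaround amounts to. The Python sketch below is a heuristic under stated assumptions, not something tree-sitter or Emacs provides: `Node`, `shift_for_insert` and `find_by_span` are hypothetical names, the node kinds are invented, and the two node lists are written out by hand to stand in for the old and new trees.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    kind: str
    start: int  # 0-based offset, inclusive
    end: int    # 0-based offset, exclusive


def shift_for_insert(span: Tuple[int, int], pos: int, length: int) -> Tuple[int, int]:
    """Adjust a tracked (start, end) span for an insertion of LENGTH at POS."""
    start, end = span
    if start >= pos:
        start += length
    if end > pos:
        end += length
    return (start, end)


def find_by_span(nodes: List[Node], kind: str, span: Tuple[int, int]) -> Optional[Node]:
    """Heuristic lookup: first node with the same kind and exact span, else None."""
    for node in nodes:
        if node.kind == kind and (node.start, node.end) == span:
            return node
    return None


# Hand-written stand-in for the old tree of "foo = 42".
old_nodes = [Node("identifier", 0, 3), Node("number", 6, 8)]
tracked = old_nodes[1]

# The text becomes "let foo = 42": 4 characters inserted at offset 0.
span = shift_for_insert((tracked.start, tracked.end), pos=0, length=4)

# Hand-written stand-in for the new tree after reparsing.
new_nodes = [Node("keyword", 0, 3), Node("identifier", 4, 7), Node("number", 10, 12)]

# Look up a candidate "equivalent" in the new tree by kind and position.
found = find_by_span(new_nodes, tracked.kind, span)
```

The lookup returns `None` whenever the edit changes the node's kind or extent, so this is a best-effort guess at the "equivalent" node rather than a real mapping from old nodes to new ones.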

I’m not sure whether tree-sitter is capable of doing what you want (after
all, the old and new trees share data). But currently it doesn’t
expose a feature to do that.

Yuan




This bug report was last modified 2 years and 160 days ago.


