GNU bug report logs - #60054
29.0.60; Infinite loop when there are cyclic path in the parse tree

Previous Next

Package: emacs;

Reported by: Yuan Fu <casouri <at> gmail.com>

Date: Wed, 14 Dec 2022 00:12:02 UTC

Severity: normal

Found in version 29.0.60

Done: Yuan Fu <casouri <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #11 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in
 the parse tree
Date: Wed, 14 Dec 2022 12:27:58 -0800

> On Dec 14, 2022, at 4:08 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Tue, 13 Dec 2022 16:11:01 -0800
>> 
>> 
>> This is not really an Emacs bug, but either tree-siter-c or
>> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
>> tomorrow, and treesit-search-forward-goto and friends hang, 
>> we (eh, you) know what’s going on.
>> 
>> I’ve submitted an issue here:
>> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> 
>> So far, I’ve only observed this in that specific edge case.
> 
> We should have protection against that, which should be easy, right?

Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)

The author advices to use cursors for traversing the tree, as cursors doesn’t have this bug. He also advised against using ts_node_parent and said that they could be removed in the future. I didn’t use cursors in the first place because they can’t traverse the tree backwards, ie, no equivalent of ts_node_prev_sibling, and the performance difference is not significant in Emacs settings. But I just went to look at the source, and it seems ts_node_prev_siblings(node) is implemented by just iterating each children from first to last, until it finds the child just before NODE, LOL[1]. I can do similar things in treesit.c with cursors. By doing that we can fix this problem and be future-proof.

In summary, I’m proposing: 
1. I add the slow-fast pointer checks in treesit.c and treesit.el
2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal functions in treesit.c.

Yuan

[1]

static inline TSNode ts_node__prev_sibling(TSNode self, bool include_anonymous) {
  Subtree self_subtree = ts_node__subtree(self);
  bool self_is_empty = ts_subtree_total_bytes(self_subtree) == 0;
  uint32_t target_end_byte = ts_node_end_byte(self);

  TSNode node = ts_node_parent(self);
  TSNode earlier_node = ts_node__null();
  bool earlier_node_is_relevant = false;

  while (!ts_node_is_null(node)) {
    TSNode earlier_child = ts_node__null();
    bool earlier_child_is_relevant = false;
    bool found_child_containing_target = false;

    TSNode child;
    NodeChildIterator iterator = ts_node_iterate_children(&node);
    while (ts_node_child_iterator_next(&iterator, &child)) {
      if (child.id == self.id) break;
      if (iterator.position.bytes > target_end_byte) {
        found_child_containing_target = true;
        break;
      }

      if (iterator.position.bytes == target_end_byte &&
          (!self_is_empty ||
           ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child), self_subtree))) {
        found_child_containing_target = true;
        break;
      }

      if (ts_node__is_relevant(child, include_anonymous)) {
        earlier_child = child;
        earlier_child_is_relevant = true;
      } else if (ts_node__relevant_child_count(child, include_anonymous) > 0) {
        earlier_child = child;
        earlier_child_is_relevant = false;
      }
    }

    if (found_child_containing_target) {
      if (!ts_node_is_null(earlier_child)) {
        earlier_node = earlier_child;
        earlier_node_is_relevant = earlier_child_is_relevant;
      }
      node = child;
    } else if (earlier_child_is_relevant) {
      return earlier_child;
    } else if (!ts_node_is_null(earlier_child)) {
      node = earlier_child;
    } else if (earlier_node_is_relevant) {
      return earlier_node;
    } else {
      node = earlier_node;
    }
  }

  return ts_node__null();
}





This bug report was last modified 2 years and 238 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.