GNU bug report logs - #60054
29.0.60; Infinite loop when there are cyclic path in the parse tree

Previous Next

Package: emacs;

Reported by: Yuan Fu <casouri <at> gmail.com>

Date: Wed, 14 Dec 2022 00:12:02 UTC

Severity: normal

Found in version 29.0.60

Done: Yuan Fu <casouri <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 60054 in the body.
You can then email your comments to 60054 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Wed, 14 Dec 2022 00:12:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Yuan Fu <casouri <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 14 Dec 2022 00:12:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Bug Report Emacs <bug-gnu-emacs <at> gnu.org>
Subject: 29.0.60; Infinite loop when there are cyclic path in the parse tree
Date: Tue, 13 Dec 2022 16:11:01 -0800
This is not really an Emacs bug, but either tree-siter-c or
tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
tomorrow, and treesit-search-forward-goto and friends hang, 
we (eh, you) know what’s going on.

I’ve submitted an issue here:
https://github.com/tree-sitter/tree-sitter-c/issues/119

So far, I’ve only observed this in that specific edge case.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Wed, 14 Dec 2022 12:09:02 GMT) Full text and rfc822 format available.

Message #8 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60;
 Infinite loop when there are cyclic path in the parse tree
Date: Wed, 14 Dec 2022 14:08:23 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Tue, 13 Dec 2022 16:11:01 -0800
> 
> 
> This is not really an Emacs bug, but either tree-siter-c or
> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
> tomorrow, and treesit-search-forward-goto and friends hang, 
> we (eh, you) know what’s going on.
> 
> I’ve submitted an issue here:
> https://github.com/tree-sitter/tree-sitter-c/issues/119
> 
> So far, I’ve only observed this in that specific edge case.

We should have protection against that, which should be easy, right?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Wed, 14 Dec 2022 20:29:02 GMT) Full text and rfc822 format available.

Message #11 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in
 the parse tree
Date: Wed, 14 Dec 2022 12:27:58 -0800

> On Dec 14, 2022, at 4:08 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Tue, 13 Dec 2022 16:11:01 -0800
>> 
>> 
>> This is not really an Emacs bug, but either tree-siter-c or
>> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
>> tomorrow, and treesit-search-forward-goto and friends hang, 
>> we (eh, you) know what’s going on.
>> 
>> I’ve submitted an issue here:
>> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> 
>> So far, I’ve only observed this in that specific edge case.
> 
> We should have protection against that, which should be easy, right?

Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)

The author advices to use cursors for traversing the tree, as cursors doesn’t have this bug. He also advised against using ts_node_parent and said that they could be removed in the future. I didn’t use cursors in the first place because they can’t traverse the tree backwards, ie, no equivalent of ts_node_prev_sibling, and the performance difference is not significant in Emacs settings. But I just went to look at the source, and it seems ts_node_prev_siblings(node) is implemented by just iterating each children from first to last, until it finds the child just before NODE, LOL[1]. I can do similar things in treesit.c with cursors. By doing that we can fix this problem and be future-proof.

In summary, I’m proposing: 
1. I add the slow-fast pointer checks in treesit.c and treesit.el
2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal functions in treesit.c.

Yuan

[1]

static inline TSNode ts_node__prev_sibling(TSNode self, bool include_anonymous) {
  Subtree self_subtree = ts_node__subtree(self);
  bool self_is_empty = ts_subtree_total_bytes(self_subtree) == 0;
  uint32_t target_end_byte = ts_node_end_byte(self);

  TSNode node = ts_node_parent(self);
  TSNode earlier_node = ts_node__null();
  bool earlier_node_is_relevant = false;

  while (!ts_node_is_null(node)) {
    TSNode earlier_child = ts_node__null();
    bool earlier_child_is_relevant = false;
    bool found_child_containing_target = false;

    TSNode child;
    NodeChildIterator iterator = ts_node_iterate_children(&node);
    while (ts_node_child_iterator_next(&iterator, &child)) {
      if (child.id == self.id) break;
      if (iterator.position.bytes > target_end_byte) {
        found_child_containing_target = true;
        break;
      }

      if (iterator.position.bytes == target_end_byte &&
          (!self_is_empty ||
           ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child), self_subtree))) {
        found_child_containing_target = true;
        break;
      }

      if (ts_node__is_relevant(child, include_anonymous)) {
        earlier_child = child;
        earlier_child_is_relevant = true;
      } else if (ts_node__relevant_child_count(child, include_anonymous) > 0) {
        earlier_child = child;
        earlier_child_is_relevant = false;
      }
    }

    if (found_child_containing_target) {
      if (!ts_node_is_null(earlier_child)) {
        earlier_node = earlier_child;
        earlier_node_is_relevant = earlier_child_is_relevant;
      }
      node = child;
    } else if (earlier_child_is_relevant) {
      return earlier_child;
    } else if (!ts_node_is_null(earlier_child)) {
      node = earlier_child;
    } else if (earlier_node_is_relevant) {
      return earlier_node;
    } else {
      node = earlier_node;
    }
  }

  return ts_node__null();
}





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Thu, 15 Dec 2022 06:17:02 GMT) Full text and rfc822 format available.

Message #14 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in
 the parse tree
Date: Thu, 15 Dec 2022 08:16:39 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Wed, 14 Dec 2022 12:27:58 -0800
> Cc: 60054 <at> debbugs.gnu.org
> 
> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
> >> 
> >> So far, I’ve only observed this in that specific edge case.
> > 
> > We should have protection against that, which should be easy, right?
> 
> Just to make sure, we want to use something like slow-fast pointers, where we have two pointers, and one goes twice as fast, right? That’s the one I was taught in school :-)

No, I mean protect us from inflooping by checking that the parent of a
node is not the node itself.

> In summary, I’m proposing: 
> 1. I add the slow-fast pointer checks in treesit.c and treesit.el
> 2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal functions in treesit.c.

SGTM, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Fri, 16 Dec 2022 01:15:02 GMT) Full text and rfc822 format available.

Message #17 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in 
 the parse tree
Date: Thu, 15 Dec 2022 17:14:36 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Wed, 14 Dec 2022 12:27:58 -0800
>> Cc: 60054 <at> debbugs.gnu.org
>> 
>> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
>> >> 
>> >> So far, I’ve only observed this in that specific edge case.
>> > 
>> > We should have protection against that, which should be easy, right?
>> 
>> Just to make sure, we want to use something like slow-fast pointers,
>> where we have two pointers, and one goes twice as fast, right?
>> That’s the one I was taught in school :-)
>
> No, I mean protect us from inflooping by checking that the parent of a
> node is not the node itself.

In this particular case, it is the siblings’ parent that equals to the
node. Ie, node->sibling->parent = node.  If your intention is to protect
us from this particular case, switching to use cursors will avoid this
bug.

Yuan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Sat, 17 Dec 2022 23:29:02 GMT) Full text and rfc822 format available.

Message #20 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in 
 the parse tree
Date: Sat, 17 Dec 2022 15:28:01 -0800
Yuan Fu <casouri <at> gmail.com> writes:

> Eli Zaretskii <eliz <at> gnu.org> writes:
>
>>> From: Yuan Fu <casouri <at> gmail.com>
>>> Date: Wed, 14 Dec 2022 12:27:58 -0800
>>> Cc: 60054 <at> debbugs.gnu.org
>>> 
>>> >> https://github.com/tree-sitter/tree-sitter-c/issues/119
>>> >> 
>>> >> So far, I’ve only observed this in that specific edge case.
>>> > 
>>> > We should have protection against that, which should be easy, right?
>>> 
>>> Just to make sure, we want to use something like slow-fast pointers,
>>> where we have two pointers, and one goes twice as fast, right?
>>> That’s the one I was taught in school :-)
>>
>> No, I mean protect us from inflooping by checking that the parent of a
>> node is not the node itself.
>
> In this particular case, it is the siblings’ parent that equals to the
> node. Ie, node->sibling->parent = node.  If your intention is to protect
> us from this particular case, switching to use cursors will avoid this
> bug.

Ok, I made the change to use cursor API with tests. Hopefully this is
the last time we need to change treesit.c before release. The
node->sibling->parent = node cyclic path should be fixed by this change,
do you still want checks for it?

Yuan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#60054; Package emacs. (Sun, 18 Dec 2022 06:01:02 GMT) Full text and rfc822 format available.

Message #23 received at 60054 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 60054 <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in 
 the parse tree
Date: Sun, 18 Dec 2022 08:00:39 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 17 Dec 2022 15:28:01 -0800
> Cc: 60054 <at> debbugs.gnu.org
> 
> > In this particular case, it is the siblings’ parent that equals to the
> > node. Ie, node->sibling->parent = node.  If your intention is to protect
> > us from this particular case, switching to use cursors will avoid this
> > bug.
> 
> Ok, I made the change to use cursor API with tests. Hopefully this is
> the last time we need to change treesit.c before release.

This broke the Windows build (I fixed it).  You cannot start using new
tree-sitter functions without adding the boilerplate code for loading
them dynamically from the shared library at run time.

> The node->sibling->parent = node cyclic path should be fixed by this
> change, do you still want checks for it?

If that problem can never happen, there's no need for the checks.

Thanks.




Reply sent to Yuan Fu <casouri <at> gmail.com>:
You have taken responsibility. (Sun, 18 Dec 2022 08:11:01 GMT) Full text and rfc822 format available.

Notification sent to Yuan Fu <casouri <at> gmail.com>:
bug acknowledged by developer. (Sun, 18 Dec 2022 08:11:02 GMT) Full text and rfc822 format available.

Message #28 received at 60054-done <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 60054-done <at> debbugs.gnu.org
Subject: Re: bug#60054: 29.0.60; Infinite loop when there are cyclic path in 
 the parse tree
Date: Sun, 18 Dec 2022 00:10:23 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Sat, 17 Dec 2022 15:28:01 -0800
>> Cc: 60054 <at> debbugs.gnu.org
>> 
>> > In this particular case, it is the siblings’ parent that equals to the
>> > node. Ie, node->sibling->parent = node.  If your intention is to protect
>> > us from this particular case, switching to use cursors will avoid this
>> > bug.
>> 
>> Ok, I made the change to use cursor API with tests. Hopefully this is
>> the last time we need to change treesit.c before release.
>
> This broke the Windows build (I fixed it).  You cannot start using new
> tree-sitter functions without adding the boilerplate code for loading
> them dynamically from the shared library at run time.

Ah right, it evaded my mind, sorry about that.

>> The node->sibling->parent = node cyclic path should be fixed by this
>> change, do you still want checks for it?
>
> If that problem can never happen, there's no need for the checks.

Cool. I’m closing this.

Yuan




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 15 Jan 2023 12:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 238 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.