GNU bug report logs - #58158
29.0.50; [overlay] Interval tree iteration considered harmful

Previous Next

Package: emacs;

Reported by: Gerd Möllmann <gerd.moellmann <at> gmail.com>

Date: Thu, 29 Sep 2022 05:30:02 UTC

Severity: normal

Found in version 29.0.50

Fixed in version 30.1

Done: Gerd Möllmann <gerd.moellmann <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #50 received at 58158 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: gerd.moellmann <at> gmail.com, 58158 <at> debbugs.gnu.org
Subject: Re: bug#58158: 29.0.50; [overlay] Interval tree iteration
 considered harmful
Date: Thu, 29 Sep 2022 12:48:34 -0400
>> > I may be missing something, but it looks like the sole purpose of the
>> > iter_start/iter_finish dance is to ensure only one iteration per tree
>> > is running at any given time, and that's because the iteration uses
>> > some state variable(s) of which there's only one instance per tree.
>> >
>> > Stefan, am I missing something?
>> 
>> One reason is that traversing a binary tree usually requires something
>> like recursion, but that wouldn't fit very conveniently with the current
>> code (nor with C in general since you can't make a local recursive
>> closure which accesses local variables from the surrounding function).
>
> I'm not sure I understand how recursion is related.  Are you saying
> that recursion is replaced with iteration?  But then, if _start and
> _finish are called by the same caller, we don't really need the
> protection, since no one can start another iteration while the first
> one runs.  Right?

Typically, current code will look something like:

    int x;
    Lisp_Object y;

    buffer_overlay_iter_start (current_buffer, prev, pos, ITREE_DESCENDING);
    while ((node = buffer_overlay_iter_next (current_buffer)))
      {
        ... do something that updates x and y ...
      }
    buffer_overlay_iter_finish (current_buffer);

If we were to use recursion, then we'd need to define a new (recursive)
function which does what's currently done in the `while` loop, but this
function can't access `x` and `y`, so it would need to take them as
argument, or a reference to them...

The use of an iterator is definitely convenient (and is a standard
approach in many languages).

>> Another is the need to update the begin/end fields (these need updating
>> because of insertions/deletions but they're updated lazily while
>> traversing the tree to avoid an O(N) complexity during the
>> insertions/deletions).  Hiding that behind 'some kind of "next node"
>> function keeps the code more readable.
>
> Where in the code do you see iteration that adds or deletes nodes?

No, I meant insertion/deletion of text in the buffer, thus requiring
updates to `begin/end` fields.

> Btw, couldn't we handle this by having a flag in the node that tells
> us whether the begin/end fields can be trusted?  Then the first caller
> who need them would run the update and reset the flags, and we still
> have lazy update, albeit somewhat less lazy, but without the need for
> guarding the iteration with start/finish calls.  Would that work?

Yes, it would, but it's still O(N).
The current approach is inspired by the approach used in `intervals.c`
which also updates those fields lazily/ondemand so as to avoid the O(N)
performance impact.

>> For now, I pushed a simple fix to traverse the tree "by hand" in the GC
>> rather than via the iterator.
> So this removes the restriction of not having GC during iteration?

Yes.

> Also, I take it that you don't consider the current code is as
> "harmful" as Gerd thinks it is?

I don't like the global state it uses, but I think we can fix this
aspect without too much trouble.

> IOW, you don't share his opinion that this implementation is
> a "no-go"?

No, indeed, I don't.


        Stefan





This bug report was last modified 1 year and 311 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.