GNU bug report logs -
#76538
31.0.50; 31.0.50; 31.0.50; feature/igc: using magit-section-cycle-global (S-TAB) and magit-section-toggle (TAB) in some random ways blocks GNU Emacs.
Previous Next
Full log
View this message in rfc822 format
[ See the end for a possible explanation and solution to the slowdown
we see on the igc branch. ]
>> FWIW, I have a branch `scratch/markers-as-gap-array` where I replace the
>> linked list of markers with an "array with a gap" where markers are
>> sorted, so we use a binary search. It might help in your case.
>
> Maybe not on feature/igc (this bug is about igc). It made changes to the
> way markers are stored in buffer and your branch might need to be
> adapted to work with the branch.
Definitely.
Not sure how to proceed, tho.
I kind of like the idea of keeping "separate" development on
separate branches. But indeed, since the igc branch already changed the
implementation of markers, maybe I should rebase my branch on top of igc?
>> It was pretty close to be "ready to merge" at some point, except that
>> I had trouble convincing myself that it was a clear win: in my tests, it
>> made `save-excursion` slower (the current naïve data-structure is ideal
>> for most "trivial" uses of `save-excursion`: we create a marker,
>> push it onto the head of the list, do a bit of unrelated work in the
>> body and then come back and just pop the marker that's still at the head
>> of the list) and the cases where it makes the code faster are hard to
>> come by, other than artificial micro-benchmarks.
>
> FYI, feature/igc is also using an array for markers. So, your approach
> might not be as bad here.
In the igc branch, both adding a marker to a buffer and removing it from
a buffer are very fast O(1) operations (even better than the code on
`master`, for the case of removal), whereas with my sorted-array-with-gap
I need O(log N) time for each of those. 🙁
I can probably improve the performance a bit by taking advantage of
locality, but even if I can reduce the average number of iterations
through the loop, there is still the fact that the current code doesn't
need to iterate at all.
>> Indeed those cache-markers are referenced only weakly so they currently
>> disappear at every GC. I can't remember if I kept this behavior in my
>> branch, but I remember considering some alternative where we keep every
>> other weak marker.
> It may or may not be the case on feature/igc. I think it remains to be
> the case, and, unlike master, GC on feature/igc may happen at arbitrary
> time rather than at specific code points. So, it may clear the cache in
> the middle of the search (AFAIU).
With my sorted-array-with-gap we could be much more sophisticated about
the handling of cache-markers:
- The data-structure can much more easily handle a large number of
markers (even during things like insertion/deletion the updates are
significantly cheaper), so there is less pressure the get rid of
cache markers.
- Because the array is sorted we can easily estimate the usefulness
of a given cache-marker (by looking at its distance to the next marker).
- We could keep a running cost/benefit analysis for each cache marker,
which gets incremented when we need to update the position
and decremented when we use the marker for byte<->char conversion.
- In the current system, once we have "too many markers" (and that
happens pretty quickly) the byte<->char conversion code may fail to
find the nearest marker, simply because it's too far from the top, so
we end up creating yet more cache-markers. If we're not careful to
throw away the cache-markers from the end of the linked-list, this
could quickly lead to absurdly large numbers of (cache-)markers.
BTW, now that I look at the code, maybe the problem you're bumping into
is that on the igc branch, (cache-)markers are added to the rear of the
array (as a first approximation), whereas on `master` they're added to
the front of the linked list.
In `buf_charpos_to_bytepos` we scan the array/list starting from
the front, so once the array gets large we never reach the end of the
array where we would find the recently-created nearby cache-marker, so
instead of using that nearby (cache-)marker we end up creating a new one
and paying (again) for a costly long scan.
Maybe if we change the `DO_MARKERS` iteration to scan from the end,
we'd recover a behavior much closer to what we see on `master`.
Gerd?
Stefan
This bug report was last modified 105 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.