GNU bug report logs -
#47711
27.1; Deferred highlighting support in `completion-all-completions', `vertico--all-completions`
Previous Next
Reported by: Daniel Mendler <mail <at> daniel-mendler.de>
Date: Sun, 11 Apr 2021 20:52:01 UTC
Severity: normal
Found in version 27.1
Done: Daniel Mendler <mail <at> daniel-mendler.de>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
[Message part 1 (text/plain, inline)]
Hi all!
Time flies, doesn't it?
On 14/08/2021 10:16, Eli Zaretskii wrote:
>>> If one removes these lines, the process becomes much faster, but there is a
>>> problem with highlighting. My idea is indeed to defer highlighting by not
>>> setting the 'face property directly on that shared string, but some
>>> other property
>>> that is read later from the shared string by compliant frontents.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
>
> Daniel seems to think otherwise, AFAIU.
>
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
>>
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed. And this is usually negligible.
>>
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs. But the non-mutating approach should have the best optimization
>> potential when the list is long.
>
> So I guess the suggestion to have a benchmark is still useful, because
> the estimations of which approach has better performance vary between
> you three. So maybe producing such benchmarks would be a good step?
To cross this out from my TODO, I spent most of the day rebasing both of
the proposed patches (one of them longer than the other) -- one from an
attachment here and another from a commit inside the
scratch/icomplete-lazy-highlight-attempt-2 branch, porting icomplete to
Daniel's new completion-filter-completions API (*), and benchmarking.
(*) Included in the attached patch: it needed changing just two lines
inside icomplete, but also new variable completion-all-sorted-highlight
and updates to completion--cache-all-sorted-completions and
completion-all-sorted-completions.
Both rebased patches are attached to this email for your convenience.
AFAICT, the results confirmed my expectations quoted above.
Using Joao's benchmark, with setup:
(defmacro lotsoffunctions ()
`(progn
,@(cl-loop repeat 150000
collect `(defun ,(intern (symbol-name (gensym
"heyhey"))) () 42))))
(lotsoffunctions)
I ran the comparisons for empty and non-empty inputs.
With no characters typed:
(benchmark-run 1
(let ((completion-styles '(flex))
(completion-lazy-hilit (cl-gensym)) ; might not be defined
)
;; Uncomment one of the lines below, depending on the patch used.
;; (completion-all-completions "" obarray 'fboundp 0 nil)
;; (completion-filter-completions "" obarray 'fboundp 0 nil)
))
master => 0.066
lazy-hilit => 0.045
filter-and-defer => 0.041 (but more often ~0.110 including GC, somehow)
With one character typed:
(benchmark-run 1
(let ((completion-styles '(flex))
(completion-lazy-hilit (cl-gensym)) ; might not be defined
)
;; Uncomment one of the lines below, depending on the patch used.
;; (completion-all-completions "h" obarray 'fboundp 1 nil)
;; (completion-filter-completions "h" obarray 'fboundp 1 nil)
))
master => 0.824
lazy-hilit => 0.395
filter-and-defer => 0.125 (!)
This more or less translates into the improvement in speed of
fido-vertical-mode, according to my benchmark-progn call inside
icomplete-exhibit (included in both attached patches for convenience).
For non-empty inputs (h or hh or hhe, to match the generated functions),
filter-and-defer is about 1.5x faster than lazy-hilit, like 0.450ms vs
0.640ms.
lazy-hilit is slightly faster than filter-and-defer with the empty input
(like 380ms vs 420ms), and I'm not yet sure why, but it's the scenario
with 0 highlighting (and so no flex scoring/sorting). Perhaps some
short-circuiting can be added somewhere to reach parity, or it's the
cost of extra branching somewhere for backward compatibility (which
could be removed in the future). Worth additional study.
[0001-Add-new-completion-filter-completions-API-and-deferr-v3.diff (text/x-patch, attachment)]
[completion-lazy-hilit.patch (text/x-patch, attachment)]
This bug report was last modified 172 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.