GNU bug report logs - #47711
27.1; Deferred highlighting support in `completion-all-completions', `vertico--all-completions`

Previous Next

Package: emacs;

Reported by: Daniel Mendler <mail <at> daniel-mendler.de>

Date: Sun, 11 Apr 2021 20:52:01 UTC

Severity: normal

Found in version 27.1

Done: Daniel Mendler <mail <at> daniel-mendler.de>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>, Stefan Monnier <monnier <at> IRO.UMontreal.CA>, João Távora <joaotavora <at> gmail.com>
Cc: 47711 <at> debbugs.gnu.org
Subject: bug#47711: bug#48841: bug#47711: bug#48841: bug#47711: [PATCH VERSION 2] Add new `completion-filter-completions` API and deferred highlighting
Date: Wed, 25 Oct 2023 01:25:23 +0300
[Message part 1 (text/plain, inline)]
Hi all!

Time flies, doesn't it?

On 14/08/2021 10:16, Eli Zaretskii wrote:
>>> If one removes these lines, the process becomes much faster, but there is a
>>> problem with highlighting.  My idea is indeed to defer highlighting by not
>>> setting the 'face property directly on that shared string, but some
>>> other property
>>> that is read later from the shared string by compliant frontents.
>>
>> I haven't done any direct benchmarking, but I'm pretty sure that this
>> approach cannot, by definition, be as fast as the non-mutating one.
> 
> Daniel seems to think otherwise, AFAIU.
> 
>> Because you go through the whole list and mutate all of its elements,
>> you have to perform a certain bit of work W x N times, where N is the
>> size of the whole list.
>>
>> Whereas the deferred-mutation approach will only have to do its bit
>> (which is likely more work, like, WWW) only 20 times, or 100 times, or
>> however many completions are displayed. And this is usually negligible.
>>
>> However big the difference is going to be, I can't say in advance, of
>> course, or whether it's going to be shadowed by some other performance
>> costs. But the non-mutating approach should have the best optimization
>> potential when the list is long.
> 
> So I guess the suggestion to have a benchmark is still useful, because
> the estimations of which approach has better performance vary between
> you three.  So maybe producing such benchmarks would be a good step?

To cross this out from my TODO, I spent most of the day rebasing both of 
the proposed patches (one of them longer than the other) -- one from an 
attachment here and another from a commit inside the 
scratch/icomplete-lazy-highlight-attempt-2 branch, porting icomplete to 
Daniel's new completion-filter-completions API (*), and benchmarking.

(*) Included in the attached patch: it needed changing just two lines 
inside icomplete, but also new variable completion-all-sorted-highlight 
and updates to completion--cache-all-sorted-completions and 
completion-all-sorted-completions.

Both rebased patches are attached to this email for your convenience.

AFAICT, the results confirmed my expectations quoted above.

Using Joao's benchmark, with setup:

  (defmacro lotsoffunctions ()
    `(progn
       ,@(cl-loop repeat 150000
                  collect `(defun ,(intern (symbol-name (gensym 
"heyhey"))) () 42))))

  (lotsoffunctions)

I ran the comparisons for empty and non-empty inputs.

With no characters typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym)) ; might not be defined
          )
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "" obarray 'fboundp 0 nil)
      ;; (completion-filter-completions "" obarray 'fboundp 0 nil)
      ))

master => 0.066
lazy-hilit => 0.045
filter-and-defer => 0.041 (but more often ~0.110 including GC, somehow)

With one character typed:

  (benchmark-run 1
    (let ((completion-styles '(flex))
          (completion-lazy-hilit (cl-gensym)) ; might not be defined
          )
      ;; Uncomment one of the lines below, depending on the patch used.
      ;; (completion-all-completions "h" obarray 'fboundp 1 nil)
      ;; (completion-filter-completions "h" obarray 'fboundp 1 nil)
      ))

master => 0.824
lazy-hilit => 0.395
filter-and-defer => 0.125 (!)

This more or less translates into the improvement in speed of 
fido-vertical-mode, according to my benchmark-progn call inside 
icomplete-exhibit (included in both attached patches for convenience). 
For non-empty inputs (h or hh or hhe, to match the generated functions), 
filter-and-defer is about 1.5x faster than lazy-hilit, like 0.450ms vs 
0.640ms.

lazy-hilit is slightly faster than filter-and-defer with the empty input 
(like 380ms vs 420ms), and I'm not yet sure why, but it's the scenario 
with 0 highlighting (and so no flex scoring/sorting). Perhaps some 
short-circuiting can be added somewhere to reach parity, or it's the 
cost of extra branching somewhere for backward compatibility (which 
could be removed in the future). Worth additional study.
[0001-Add-new-completion-filter-completions-API-and-deferr-v3.diff (text/x-patch, attachment)]
[completion-lazy-hilit.patch (text/x-patch, attachment)]

This bug report was last modified 172 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.