Package: emacs
Reported by: Daniel Martín <mardani29 <at> yahoo.es>
Date: Wed, 22 Sep 2021 09:31:02 UTC
Severity: normal
Found in version 28.0.1
Message #14 received at 50733 <at> debbugs.gnu.org:
From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Daniel Martín <mardani29 <at> yahoo.es>
Cc: 50733 <at> debbugs.gnu.org
Subject: Re: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Thu, 23 Sep 2021 02:09:16 +0300
On 23.09.2021 00:58, Daniel Martín wrote:
> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>
>> IIRC you are using macOS. I received another report recently that
>> find/grep based tooling, and project-find-regexp in particular, are
>> pretty slow on that OS.
>
> Yes, this is on macOS.
>
>>
>> When you say "block for a long time", how long are we talking about?
>>
>> To try it, evaluate
>>
>> (benchmark 1 '(project-find-regexp "new-collection"))
>
> I usually work on a monorepo with ~67000 tracked files (many of them big
> binary files). Here's what I get when using ripgrep as the xref search
> program:
>
> Elapsed time: 36.087181s (8.067474s in 22 GCs)

Thanks for testing. Did the switch to ripgrep help much? I wonder if we
should advertise this setting and recommendation more prominently, at
least until we get auto-detection.

> Running the same search with ripgrep from the command line takes around
> 6 seconds.

Is that with an SSD? Your project sounds respectable. The torvalds-linux
repo I have checked out here is also 70000 files, but I guess your files
are bigger.

>> Another benchmark to try is
>>
>> (benchmark 1 '(project-files (project-current)))
>
> Elapsed time: 1.590223s (0.432372s in 1 GCs)

That's a while (I wonder if you find 'project-find-file' usable with
this kind of performance), but still better than I might have expected.
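
For reference, the ripgrep setting discussed above is the user option
xref-search-program. A minimal sketch of enabling it from an init file,
assuming the rg executable is installed and on the PATH Emacs sees
(the thread does not show how the reporter configured it):

```elisp
;; Assumption: `rg' is installed and visible to Emacs via `exec-path'.
;; `xref-search-program' accepts the symbols `grep' (the default) and
;; `ripgrep'.
(setq xref-search-program 'ripgrep)

;; Re-run the benchmark from this thread to compare timings.
(benchmark 1 '(project-find-regexp "new-collection"))
```
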
> Here's an ELisp profile of the first benchmark:
>
>  8696  78% - command-execute
>  8696  78% - call-interactively
>  8493  76% - funcall-interactively
>  8480  76% - eval-expression
>  8479  76% - eval
>  8479  76% - project-find-regexp
>  8227  74% - xref--show-xrefs
>  8227  74% - xref--show-xref-buffer
>  5584  50% - #<compiled 0x140b5a40100bafc6>
>  5584  50% - apply
>  5584  50% - project--find-regexp-in-files
>  5574  50% - xref-matches-in-files
>  3016  27% - xref--convert-hits
>  3000  27% - mapcan
>  2992  27% - #<compiled -0x6cdcd56218925c3>
>  2734  24% - xref--collect-matches
>  2094  18% - xref--collect-matches-1
>   800   7% + xref-make-match
>   774   7% + xref-make-file-location
>   104   0% xref--find-file-buffer
>    80   0% file-remote-p
>    51   0% xref--regexp-syntax-dependent-p
>   906   8% + xref--process-file-region
>   331   2% sort
>  1413  12% + xref--analyze
>  1230  11% + xref--show-common-initialize
>   249   2% + project-files
>     3   0% + project-current
>     9   0% + minibuffer-complete
>     4   0% + execute-extended-command
>   203   1% + byte-code
>  2314  20% - ...
>  2314  20% Automatic GC
>    27   0% + timer-event-handler

When you have a lot of matches, at some point Lisp overhead is going to
show up. E.g., the searches seem almost instantaneous with up to several
thousand matches here, but 10000s and 100000s - yeah, I have to wait.

Help with optimizations in that area (around/in xref-matches-in-files
and xref--convert-hits) is welcome, but I'm not sure how much more we
can squeeze.

> The search time is reduced when I use a more specific search term,
> presumably because the number of results is lower and the Elisp
> post-processing takes less time. Here's what I got, for example, when I
> search for something with results from only one file:
>
> Elapsed time: 6.859815s (0.864738s in 2 GCs)
>
> Compared to the time taken by the same query from the command line
> (6.5s) shows that the Elisp post-processing time is probably negligible
> in this scenario.

It's a good result.
A little suspicious, though: given that project-find-regexp calls
project-files first, and the latter takes 1.5s, the difference should
be ~ that time. But I guess rg also needs to traverse the directory
tree, and spends some time on doing that too.

What else can be done -- again, if someone wants to investigate an
asynchronous/nonblocking API for Xref (or using threads) -- welcome. The
case when most of the time is spent in the subprocess is a good match.
But I don't think we'll manage this for the upcoming release.

Another thing you can do is set up the additional ignores for the
project. If those big binary files are not something you are interested
in searching and touching, you could add ignore entries for them.

When the vc project backend is in use (default), it is currently done
via .dir-locals.el: the variable is project-vc-ignores, it's a list of
strings that should be globs. See its docstring and the explanation in
project-ignores's docstring.

Note that ignores also affect project-find-file.
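
To illustrate the ignores setup described above, here is a minimal
sketch of a .dir-locals.el placed at the project root. The glob
patterns are hypothetical placeholders, not taken from the reporter's
repository; substitute whatever matches the large binary files in
question, and check the project-ignores docstring for the exact glob
rules.

```elisp
;;; .dir-locals.el in the project root
;; Hypothetical patterns: "*.bin" stands in for a binary file
;; extension, and "assets/" for a directory of large binaries (a
;; trailing slash restricts the entry to directories).
((nil . ((project-vc-ignores . ("*.bin" "assets/")))))
```

After saving the file, re-running (benchmark 1 '(project-files
(project-current))) is one way to check whether the file list shrank as
expected.
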