GNU bug report logs - #50733
28.0.1; project-find-regexp can block Emacs for a long time

Package: emacs;

Reported by: Daniel Martín <mardani29 <at> yahoo.es>

Date: Wed, 22 Sep 2021 09:31:02 UTC

Severity: normal

Found in version 28.0.1

From: Gregory Heytings <gregory <at> heytings.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50733 <at> debbugs.gnu.org, mardani29 <at> yahoo.es, dgutov <at> yandex.ru
Subject: bug#50733: 28.0.1; project-find-regexp can block Emacs for a long time
Date: Fri, 24 Sep 2021 16:24:50 +0000

>> IMO, the one and only case where a specialized tool beats ripgrep (or 
>> just plain grep) is when you just want the place(s) where the 
>> identifier is defined.
>
> No, a specialized tool that uses a DB will scale much better than any 
> tool which searches the filesystem.  _And_ it will be more accurate (if 
> used correctly).
>

Sorry, but I simply don't believe this.  At least not for general regex 
searches.  I'd be interested to see some numbers to support your 
viewpoint.

>> That's not correct, mkid only supports a limited number of programming 
>> languages. And it's not even precise: rg O_CREAT on the Emacs trunk for 
>> example returns 45 matches, gid O_CREAT returns 33 matches.
>
> I'm sorry, but this has NIH written all over it.  Am I right guessing 
> that you aren't an active user of ID Utils, and perhaps didn't even know 
> about it before I mentioned it?
>

You are wrong; of course I knew about ID Utils.  I tried and compared it 
with ripgrep a few years ago, and concluded that it's a (far) less useful 
tool, at least for my purposes, for the reasons I mentioned: it works with 
a database that must be updated, which is slow, and it is not faster than 
ripgrep.

>
> More to the point: are you saying that a tool that returns more matches 
> is necessarily better?
>

The purpose of project-find-regexp is to find all matches.

>
> Look closer at those matches which gid "missed", and you will see why it 
> didn't show them to you.
>

I looked closely (before you asked), and no, I don't see why some matches 
are not included.  For example, it returns

lib/tempname.c:212: __GT_FILE: create the file using open(O_CREAT|O_EXCL)

but not

lib/tempname.h:47: GT_FILE: create a large file using open(O_CREAT|O_EXCL)

and it returns

lib/open.c:99: /* Fail if one of O_CREAT, O_WRONLY, O_RDWR is specified and the filename

but not

lisp/gnus/nnmaildir.el:387: ;; If Emacs had O_CREAT|O_EXCL, we could return number-open here.

>
> Oh, and if ripgrep finds only 45 matches, then something is wrong with 
> it, because there are actually no less than 119 literal matches for 
> O_CREAT in the tree (not counting many binary files that also match). So 
> by this measure, ripgrep is also not the right tool for the job.
>

No, there are exactly 45 matches of "O_CREAT" on a fresh clone of the 
trunk.
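
For anyone who wants to check such a count independently of both tools, 
here is a minimal sketch in Python.  It is an approximation, not a 
reimplementation of either rg or gid: the function name 
count_matching_lines and its skip_dirs parameter are made up for this 
illustration, it counts matching lines (not occurrences), it skips files 
containing a NUL byte as a crude binary-file test, and it ignores only the 
.git directory rather than honouring .gitignore, so its total may well 
differ from what rg reports.

```python
import os


def count_matching_lines(root, needle, skip_dirs=(".git",)):
    """Count lines containing the literal `needle` in text files under `root`.

    Hypothetical helper for illustration only.
    """
    needle_bytes = needle.encode("utf-8")
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories we do not want to descend into.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file, dangling symlink, etc.
            if b"\0" in data:
                continue  # crude binary-file heuristic (an assumption)
            total += sum(1 for line in data.splitlines() if needle_bytes in line)
    return total
```

Calling count_matching_lines(".", "O_CREAT") from the top of a checkout 
gives a number that can be compared against the output of each tool.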

>> Five seconds to scan the whole Emacs trunk is IMO not fast enough 
>> (ripgrep does it in < 0.2 seconds).
>
> Those 5 sec are invested only when needed, while the time it takes 
> Grep/ripgrep to scan the files is invested every search.  Do this enough 
> times, and you paid too much time.
>

Did you read the numbers I mentioned earlier?  rg O_CREAT is as fast as 
gid O_CREAT.  And this is without regexes; rg O.*CREAT is three times 
faster than gid O.*CREAT.
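
Timings like these are easy to reproduce.  The following is a minimal 
sketch, assuming rg and gid are installed and that an ID database has 
already been built with mkid; the helper name best_time and its runs 
parameter are made up for this illustration.  It reports the best 
wall-clock time over a few runs, which reduces noise from a cold file 
cache but says nothing about the cost of building or updating the 
database.

```python
import subprocess
import time


def best_time(cmd, runs=5, cwd=None):
    """Return the fastest wall-clock time (seconds) of `runs` executions of `cmd`.

    Hypothetical helper for illustration only.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            cwd=cwd,
            stdin=subprocess.DEVNULL,   # make sure the tool does not read stdin
            stdout=subprocess.DEVNULL,  # discard matches, we only want timing
            stderr=subprocess.DEVNULL,
            check=False,                # a non-zero exit just means "no match"
        )
        best = min(best, time.perf_counter() - start)
    return best
```

For example, best_time(["rg", "O_CREAT", "."]) and 
best_time(["gid", "O_CREAT"]), both run from the top of the checkout, can 
be compared directly.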

>> And without incremental updates, updating the database would be 
>> necessary before each invocation of gid, because what users expect when 
>> they search for something are accurate results corresponding to the 
>> current state of the project, not results from, say, an hour ago.
>
> It is very easy to trigger a new mkid run from a file-notification that 
> watches the project tree, so that it runs in the background without the 
> user noticing when needed.  Puff! the problem's gone.
>

Your "very easy" solution is still, IMO, an unnecessary complexity, with 
little (if any) benefit.




This bug report was last modified 3 years and 261 days ago.