GNU bug report logs - #71179
[PATCH] In rgrep, check matching files before excluding files

Package: emacs

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Fri, 24 May 2024 20:15:02 UTC

Severity: normal

Tags: patch

Fixed in version 30.1

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 71179 <at> debbugs.gnu.org
Subject: bug#71179: [PATCH] In rgrep, check matching files before excluding files
Date: Sat, 25 May 2024 15:26:56 +0300

On 25/05/2024 09:36, Eli Zaretskii wrote:
>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Date: Fri, 24 May 2024 16:14:39 -0400
>>
>> In my benchmarking, this takes (rgrep "foo" "*.el" "~/src/emacs/trunk/")
>> from ~410ms to ~130ms.
> 
> Which is a minor improvement at best, possibly a negligible one.  In
> my testing (on MS-Windows), I see a barely-tangible improvement: 0.7%.

That's unfortunate, but I think we prioritize GNU systems when making 
such decisions. I suppose filesystem access has more overhead on MSW, or 
there are other problems with the port.

>> Date: Fri, 24 May 2024 23:45:00 +0300
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>>> In my benchmarking, this takes (rgrep "foo" "*.el" "~/src/emacs/trunk/")
>>> from ~410ms to ~130ms.
>>
>> I can confirm improvement here (though not exactly 3x).
>>
>> 1.9s to 1.3s in a Linux checkout, for example. Nice.
> 
> Which is still quite minor.

A 30% improvement is nothing to sneeze at, especially for a code change 
as simple as this one.

I've tried the "gecko-dev" checkout, and there the change is from 6s 
down to 1.9s when searching for .cpp and from 6s to 3.7s when searching 
for .js (the #1 file type: 25% of files in that project are .js).
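
For comparison, the timings quoted in this thread can be turned into relative reductions. This is only a small sketch of the arithmetic; `reduction` is a hypothetical helper name, and the inputs are the before/after figures reported above.

```shell
# Hypothetical helper: percentage reduction from a "before" to an "after" timing.
reduction() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f%%\n", (before - after) / before * 100 }'
}

reduction 410 130   # Emacs trunk, *.el (ms)  -> 68.3%
reduction 1.9 1.3   # Linux checkout (s)      -> 31.6%
reduction 6 1.9     # gecko-dev, .cpp (s)     -> 68.3%
reduction 6 3.7     # gecko-dev, .js (s)      -> 38.3%
```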

Naturally not all cases will see an improvement, but many will, and for 
example 'xref-find-references' also uses grep-find-template (by default) 
and specifies a list of file extensions - so it should also get faster.

>> Moving the files exclude instructions to the <F> placeholder is a slight
>> incompatibility
> 
> Right, and for that reason, we cannot install this change as-is.  We
> need either a different command or a user option controlling the order
> (with a good explanation of the effect of the difference).

A user option might work, but before we add one it would be great to 
understand which users it would be for.

>> A grep-find-template that doesn't include <X> will indeed start seeing
>> ignores based on grep-find-ignored-files in rgrep.  But, such a user can
>> just set grep-find-ignored-files to nil and then they'll stop seeing
>> ignores again.
> 
> That's not a valid argument for changing the default behavior.
> Because I could counter-argue that if you don't care about the order
> and want those few hundred milliseconds at all costs, then _you_
> can customize the template to your liking, leaving the default
> behavior intact.

I don't think you can get the same effect just by editing the template.

>> Also, for what it's worth, note that the documentation for
>> grep-find-template says this:
>>
>>    <X> - find options to restrict or expand the directory list
>>    <F> - find options to limit the files matched
>>
>> So this change makes the documentation more accurate: <X> previously
>> also affected the files matched, but now it only affects the directory
>> list, as documented.  <F> continues to limit the files matched, as
>> before.
> 
> Sorry, such incompatible changes are not acceptable, definitely when
> the gain is so small.  Correctness trumps speed.

Can you think of a specific problematic usage? The way I see it, 
grep-find-template is not really portable between different programs: 
supporting <X> in the format we're passing to it basically requires 
'find' to be used (there are no compatible alternatives). That would 
mean that passing the same arguments to <F> should work fine.
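
To illustrate the reordering under discussion, here is a minimal sketch with plain 'find'. The predicates are simplified stand-ins for what rgrep derives from grep-find-ignored-directories and grep-find-ignored-files, not the exact command produced by the patch, and the demo tree and file names are made up. Both orders select the same files here; the second one lets 'find' reject a non-matching file after a single -name test instead of first comparing it against every ignored pattern.

```shell
# Hypothetical demo tree; every name below is invented for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.git"
touch "$demo/src/a.el" "$demo/src/a.elc" "$demo/src/b.el~" "$demo/.git/c.el"

# Old order: prune ignored directories, skip ignored files,
# and only then test the wanted file name pattern.
old=$(find "$demo" -type d -name .git -prune -o \
        ! -type d \( -name '*.elc' -o -name '*~' \) -prune -o \
        -type f -name '*.el' -print | sort)

# New order: prune ignored directories, test the wanted pattern first,
# then apply the ignored-file patterns only to files that matched.
new=$(find "$demo" -type d -name .git -prune -o \
        -type f -name '*.el' ! \( -name '*.elc' -o -name '*~' \) -print | sort)

echo "old order: $old"
echo "new order: $new"
rm -rf "$demo"
```

In this tree both commands list only src/a.el. Results could differ only for a template or pattern set that relies on the ignore rules running before the name match, which is the compatibility question raised above.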




This bug report was last modified 1 year and 40 days ago.

